Posts

Showing posts from 2022

Get Apache Wayang ready to test within 5 minutes

Hey followers, I often get asked how to get Apache Wayang (https://wayang.apache.org) up and running without having a full big data processing system behind it. We heard you: we built a full-fledged Docker container called BDE (Blossom Development Environment), which is basically Wayang. Here's the repo: https://github.com/databloom-ai/BDE

I made a short screencast on how to get it running with Docker on OS X, and we also made two hands-on videos to explain the first steps. Let's start with the basics - Docker. Get the whole platform with:

docker pull ghcr.io/databloom-ai/bde:main

At the end, the Jupyter notebook address is shown. Control-click on it (OS X); the browser should open and log you in automatically. Voila - done. You now have a fully working Wayang environment, and we prepared three notebooks to make it easier to dive in. Watch our development tutorial video (part 1) to get a better understanding of what Wayang can do, and what it cannot.
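For readers who prefer a script over the screencast, here is a minimal sketch of the pull-and-run step. Only the image name comes from this post; the port mapping (Jupyter's default 8888), the `docker run` flags, and the `START_BDE` switch are my assumptions for illustration, so check the BDE repo for the exact command it recommends.

```shell
#!/bin/sh
# Image name as given in the post.
IMAGE="ghcr.io/databloom-ai/bde:main"
# ASSUMPTION: Jupyter inside the container listens on its default port 8888.
PORT="${PORT:-8888}"

# START_BDE is a hypothetical switch for this sketch: the container is only
# pulled and started when it is set to 1, otherwise the command is just shown.
if [ "${START_BDE:-0}" = "1" ]; then
    # Pull the image, then run it interactively and remove it on exit.
    docker pull "$IMAGE" && docker run --rm -it -p "${PORT}:8888" "$IMAGE"
else
    echo "Would run: docker run --rm -it -p ${PORT}:8888 ${IMAGE}"
fi
```

Run it with `START_BDE=1` to actually start the container. If the assumed port is right, the notebook address printed in the container log should be reachable from the host browser.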

Combined Federated Data Services with Blossom and Flower

When it comes to Federated Learning frameworks, we typically find two leading open source projects: Apache Wayang [2] (maintained by databloom) and Flower [3] (maintained by Adap). At first glance, both frameworks seem to do the same thing. But, as usual, a second look tells another story. How does Flower differ from Wayang? Flower is a federated learning system written in Python that supports a large number of training and AI frameworks. The beauty of Flower is the strategy concept [4]: the data scientist can define which framework is used and how. Flower delivers the model to the desired framework, watches the execution, gets the calculations back, and starts the next cycle. That makes Federated Learning in Python easy, but it also limits its use to platforms supported by Python. Flower has, as far as I could see, no data query optimizer; an optimizer understands the code and splits the model into smaller pieces to use multiple frameworks at the same ti