Posts

Get Apache Wayang ready to test within 5 minutes

Hey followers, I often get asked how to get Apache Wayang ( https://wayang.apache.org ) up and running without a full big data processing system behind it. We heard you: we built a full-fledged Docker container called BDE (Blossom Development Environment), which is basically Wayang. Here's the repo: https://github.com/databloom-ai/BDE
I made a short screencast showing how to get it running with Docker on OS X, and we also made two hands-on videos that explain the first steps. Let's start with the basics: Docker. Get the whole platform with:
    docker pull ghcr.io/databloom-ai/bde:main
At the end, the Jupyter notebook address is shown. Control-click on it (OS X); the browser should open and log you in automatically. Voila, done. You now have a fully working Wayang environment, and we prepared three notebooks to make it easier to dive in. Watch our development tutorial video (part 1) to get a better understanding of what Wayang can and cannot do. Click the video below:
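If control-click is not available in your terminal, the notebook address can also be pulled out of the container output with a few lines of Python. This is a minimal sketch that assumes the usual Jupyter startup format (a URL ending in `?token=<hex>`); the exact output of the BDE container is not shown in this post, so `sample_log` and the function name `find_jupyter_url` are purely illustrative.

```python
import re
from typing import Optional

# Assumption: Jupyter prints its address as a URL carrying a hex login token,
# e.g. http://127.0.0.1:8888/?token=0123abcd
_URL_PATTERN = re.compile(r"https?://[^\s]+\?token=[0-9a-fA-F]+")


def find_jupyter_url(log_text: str) -> Optional[str]:
    """Return the last tokenized Jupyter URL found in the log text, or None."""
    matches = _URL_PATTERN.findall(log_text)
    return matches[-1] if matches else None


# Hypothetical log excerpt, only for demonstration.
sample_log = """\
[I 10:00:00 NotebookApp] Serving notebooks from local directory: /home
[I 10:00:00 NotebookApp] http://127.0.0.1:8888/?token=0123abcd
"""

print(find_jupyter_url(sample_log))
```

Paste the saved container output into `find_jupyter_url` and open the returned address in your browser.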

Combined Federated Data Services with Blossom and Flower

When it comes to Federated Learning frameworks, we typically find two leading open source projects: Apache Wayang [2] (maintained by databloom) and Flower [3] (maintained by Adap). At first glance, both frameworks seem to do the same thing. But, as usual, a second look tells another story. How does Flower differ from Wayang? Flower is a federated learning system written in Python that supports a large number of training and AI frameworks. The beauty of Flower is the strategy concept [4]: the data scientist can define which framework is used and how. Flower delivers the model to the desired framework, watches the execution, gets the calculations back, and starts the next cycle. That makes Federated Learning in Python easy, but at the same time limits its use to platforms supported by Python. Flower has, as far as I could see, no data query optimizer; an optimizer understands the code and splits the model into smaller pieces to use multiple frameworks at the same ti
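The deliver, train, collect, repeat cycle described in this excerpt can be sketched in plain Python. This is not Flower's API and not Wayang's: it is a toy illustration that uses sample-weighted averaging (FedAvg-style) as the aggregation step, and every name in it (`local_update`, `aggregate`, `run_rounds`) is hypothetical.

```python
from typing import List, Sequence, Tuple

Weights = List[float]
ClientResult = Tuple[Weights, int]  # (updated weights, number of local samples)


def local_update(weights: Weights, data: Sequence[float], lr: float = 0.1) -> ClientResult:
    """Hypothetical client step: one gradient step pulling each weight toward the local data mean."""
    mean = sum(data) / len(data)
    updated = [w - lr * (w - mean) for w in weights]
    return updated, len(data)


def aggregate(results: Sequence[ClientResult]) -> Weights:
    """Average the client weights, weighted by each client's sample count."""
    total = sum(n for _, n in results)
    dim = len(results[0][0])
    return [sum(w[i] * n for w, n in results) / total for i in range(dim)]


def run_rounds(weights: Weights, clients: Sequence[Sequence[float]], rounds: int) -> Weights:
    """Send the model out, collect the local results, aggregate, and repeat."""
    for _ in range(rounds):
        results = [local_update(weights, data) for data in clients]
        weights = aggregate(results)
    return weights


if __name__ == "__main__":
    # Two clients whose raw data never leaves them; only weights are exchanged.
    print(run_rounds([0.0], [[1.0, 1.0], [3.0, 3.0]], rounds=200))
```

The point of the sketch is the shape of the loop: raw data stays with each client, and only model weights travel back to the coordinator.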

Compile Apache Wayang on Mac M1

We will release Apache Wayang v0.6.0 in the next few days, and during release testing I wondered whether we could get Wayang running on M1 (ARM). And yes, with a few small changes: voila!
Install Maven, Scala, SQLite and Groovy:
    brew install maven scala groovy sqlite
Download OpenJDK 8 for M1 from https://www.azul.com/downloads/?version=java-8-lts&os=macos&architecture=arm-64-bit&package=jdk and install the pkg.
Get Apache Wayang either from https://dist.apache.org/repos/dist/dev/wayang/ or git-clone it directly:
    git clone https://github.com/apache/incubator-wayang.git
Start the build process:
    cd incubator-wayang
    export JAVA_HOME=/Library/Java/JavaVirtualMachines/zulu-8.jdk/Contents/Home
    mvn clean install
Ready to go:
    [INFO] Reactor Summary for Apache Wayang 0.6.0-SNAPSHOT:
    ...
    [INFO] BUILD SUCCESS
    [INFO] ------------------------------------------------------------------------
    [INFO] Total time:  06:24 min
After the build is done, the binaries are located in Maven's local repository: ~/.m2/repository/o
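Before kicking off the roughly six-minute build, it can save time to confirm that the tools installed via Homebrew are actually on the PATH. The following is a minimal sketch, not part of the Wayang build itself; the helper name `missing_tools` is hypothetical, and the executable names assume the usual mapping of the Homebrew packages (maven provides `mvn`, sqlite provides `sqlite3`).

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executables expected from the Homebrew packages named in the post
# (assumption: maven -> mvn, sqlite -> sqlite3).
REQUIRED_TOOLS = ("mvn", "scala", "groovy", "sqlite3")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on PATH, in the order given."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing before build:", ", ".join(missing))
    else:
        print("All build tools found; 'mvn clean install' can be started.")
```

The `which` parameter is injectable only so the check can be exercised without touching the real PATH.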

Why we use ELv2 for infinimesh operator

(originally written @ medium) The infinimesh project chair has decided to use the Elastic License v2 (ELv2) for infinimesh's operator, effective today (Aug 15, 2021). infinimesh itself is unaffected: it is and will remain open source under ASL v2. Open source, and software in general, generates multiple billions in revenue, adds the same amount in savings for enterprises, and feeds the whole digital service industry. I love open source and have contributed to various projects; the newest one is Apache Wayang (incubating). The developer community around infinimesh sees increased use of our tech stack, and we appreciate that. Open source means sharing code, adapting, distributing and combining it, building new features, fixing bugs, and collaborating. It was originally initiated by Richard (Stallman) back in the 70s, when software was on-premise or ran on mainframes. These days almost everything runs in a cloud, and we have managed to make an entire AIoT platform portable and cloud native. That means our stack can run every

Stream IoT data to S3 - the simple way

First, a short introduction to infinimesh, an Internet of Things (IoT) platform which runs completely in Kubernetes: infinimesh enables the seamless integration of the entire IoT ecosystem independently of any cloud technology or provider. infinimesh easily manages millions of devices in a compliant, secure, scalable and cost-efficient way without vendor lock-in. We released some plugins over the last weeks, a task we had on our roadmap for a while. Here is what we have so far:
Elastic: Connect infinimesh IoT seamlessly to Elastic.
Timeseries: Redis-timeseries with Grafana for time series analysis and rapid prototyping; can be used in production when configured as a Redis cluster, and is ready to be hosted via Redis-Cloud.
SAP Hana: All code to connect the infinimesh IoT platform to any SAP Hana instance.
Snowflake: All code to connect the infinimesh IoT platform to any Snowflake instance.
Cloud Connect: All code to connect the infinimesh IoT platform to the public cloud providers AWS, GCP and Azu

Scalable Timeseries with infinimesh IoT platform Timeseries Plugin

The infinimesh IoT platform is built to make data privacy and data ownership possible, without any doubt. To achieve this, we have built our platform in a cloud-native way and made it fully API driven. This allows our customers to integrate our cloud into their systems without compromising IT security or having to move to a public cloud provider. We see ourselves as an extended workbench for any IoT-related ideas our customers might have. In this blog post we show how easily infinimesh extends to an ultra-scalable time series, designed for high throughput and rapid prototyping, using the timeseries plugin from the infinimesh GitHub space. Note that the plugin uses Redis in standalone mode; we strongly advise setting up a Redis cluster if you want to run the plugin in production! The plugin comes as a ready-to-ship Docker container, including Redislab TimeSeries and Grafana, to enable easy PoCs and hackathons. To get it running, you need either a working Docker environment or Kubernetes insta