Monday, April 29, 2019

Enabling IoT to establish a sustainable value chain

I wrote an article for CIO Applications; here is an archived copy of it:

IoT devices are getting more and more intelligent and can now create meshed networks by themselves, switching from sensor to actuator and sharing information only with their mesh neighbors. For example, a connected car could tell the future home that the homeowner will be home in 5 minutes, so the garage door and the front door need to be unlocked in time, the lights need to be switched on and the grid operator needs to be informed that the wallbox is now charging at 22 kW. In the near future this will happen over direct meshed information cells, operated by always-connected devices: wearables, sensors, actuators, mobile devices - in short, everything. And every cloud provider offers dozens of solutions to master these challenges, in one way, another, or a completely different one.

Self-organizing mesh networking and communication come with a permanent flow of information: massive IoT data streams that even classic Big Data frameworks like Hadoop can no longer handle in time. As the nature of data changes, the need for data processing changes with the way data is created and ingested. Most analyses will be done at the edge and during the ingestion stream, with further analysis once the data comes to rest. The data lake should be the central core to store data, but the data needs to be categorized and catalogued together with a proper, well-defined schema and data description. The gravity such data pools generate should be put to deliberate use as the motor of data-driven innovation.

Why? Batch processing helps to extract value and make predictions from stored data, even across multiple data points and storage facilities, but it does not help to react in time. Timely information in IoT is only valuable to business processes at the moment it occurs; for that job, stream processing frameworks like Spark or Kafka are more suitable. Combining both techniques brings unmatched value and impact to the business, driven by the right use of data. Stream processing during data transport closes the gap between fast-moving data and data at rest. Mostly relevant to the more costly IoT edge computing, MQTT-enabled stream processing engines deliver high throughput across all kinds of compute instances, be it in a local data center, hybrid clouds or public clouds.
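
To make the difference concrete, here is a minimal sketch in plain Python (not Spark or Kafka code). The function names, the window size and the threshold are assumptions chosen for illustration. The batch function gives one answer after all data has come to rest, while the stream function reacts as each reading arrives.

```python
from collections import deque
from statistics import mean


def batch_average(stored_readings):
    """Batch view: a single answer, computed after all data is stored."""
    return mean(stored_readings)


def stream_alerts(readings, window_size=3, threshold=30.0):
    """Stream view: yield (index, rolling_mean) as soon as the mean of the
    last `window_size` readings exceeds `threshold` (both hypothetical)."""
    window = deque(maxlen=window_size)
    for index, value in enumerate(readings):
        window.append(value)
        if len(window) == window_size:
            current = mean(window)
            if current > threshold:
                yield index, current


# Hypothetical temperature readings with a short spike in the middle.
readings = [21.0, 22.0, 23.0, 35.0, 40.0, 41.0, 22.0, 21.0]

alerts = list(stream_alerts(readings))
overall = batch_average(readings)

print("stream alerts at indexes:", [index for index, _ in alerts])
print("batch average:", overall)
```

In this example the batch average stays below the threshold and hides the spike, while the rolling window flags it while it is happening. That is the gap the combination of both techniques is meant to close.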

The same holds for available cloud technology. Every cloud provider has its own IoT solution zoo with its own lock-ins, but these often do not fit scaling plans, whether because of complexity, missing or poorly implemented parts, or simply a pricing model that bears no relation to the margin earned from an IoT-based product. A combined approach of scalable cloud technology (which fits most cases) and in-house development brings the most benefit at an affordable price, not to mention the intellectual property a business gains and keeps instead of handing it to providers and therefore competitors. Independent organisations like “Linux Foundation Edge” provide the most useful insight into Open Source projects and initiatives.

Just dumping data somewhere without a vision behind it does not help to solve the problems companies face on their digital journey, especially when it comes to questions of revenue from IoT projects. Big Data needs nearly perfect data management, data rights and data retention processes behind it. Only this makes it possible to take full advantage of any kind of data, to open new revenue and sales streams and to finally see all data-driven activity not as a cost-saving project (as most agencies and vendors promise) but as a revenue-creation project. Using modern cloud technologies moves organizations into the data-centric world, focusing on business and not operations.

Analyzing the data is the trickier part here: on the one hand every data point brings valuable input, but on the other hand an unlimited data store also makes customer insights vulnerable. I am a bit concerned about 360-degree approaches. First, the value of data collections needs to be questioned: which data is system-relevant for support, maintenance or emergencies, and which is important for generating sustainable revenue? Streaming analysis gives valuable input at the point in time the information is needed to make decisions, and also makes it possible to route data into different data stores. It is beyond question that the value of customers is higher than that of the data gathered; implementing a state-of-the-art data ethics catalogue is one of the main tasks analytics needs to cover.
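
The routing idea can be sketched in a few lines of Python. The event kinds, store names and sample events below are assumptions made for illustration and do not come from any specific product. Each event is classified while it streams by and lands only in the store where it is needed.

```python
from collections import defaultdict

# Hypothetical categories, following the two questions raised above.
OPERATIONAL_KINDS = {"error", "maintenance", "emergency"}
REVENUE_KINDS = {"usage", "purchase"}


def route(event):
    """Return the name of the target store for a single event."""
    kind = event.get("kind")
    if kind in OPERATIONAL_KINDS:
        return "operations_store"
    if kind in REVENUE_KINDS:
        return "revenue_store"
    # Data with no defined purpose is not kept at all.
    return "discard"


def process(events):
    """Route a stream of events into per-store lists."""
    stores = defaultdict(list)
    for event in events:
        target = route(event)
        if target != "discard":
            stores[target].append(event)
    return dict(stores)


events = [
    {"device": "pump-1", "kind": "error", "code": 17},
    {"device": "pump-1", "kind": "usage", "liters": 120},
    {"device": "pump-1", "kind": "location", "lat": 0.0, "lon": 0.0},
]
stores = process(events)
print(stores)
```

Dropping data that has no defined purpose at routing time is one simple way to keep the data store from growing into the unlimited collection described above.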

We are moving quickly toward a so-called interconnected world. Always-connected systems will dominate our future lives, introducing new business models by combining business areas that were never before considered together. The future CIO needs to know what implications the data has and what countless value this data can generate, but also needs to weigh the threats uncontrolled data collection can cause. Building new data-driven businesses will be the most exciting job of the future; things never done before are now possible. Embrace this.

The article can be read online: 

Thursday, March 14, 2019

Infinimesh IoT / IIoT platform is starting up!

Today is a day we will never forget - infinimesh (https://www.infinimesh.io/) is starting and lifting off! Our Kubernetes, Apache Kafka® and graph-based Industrial IoT platform is entering the alpha stage! We have been working like maniacs over the past 14 months to bring you a fully flexible, independent IoT platform, free of patents and vendor lock-in! Soon it’s your chance to test and try it out: our closed alpha will open to the public on March 30, 2019 - mark this date in your calendar!

An incredible platform comes to life

We believe smart and connected devices bring our society forward. Smart technology uses resources only when they are really necessary and thus prevents waste. On the other hand, when really required, smart things act and hence prevent accidents or simply enable a great user experience. We have started infinimesh as 100% Open Source, without patents or closed software. Any software components we have developed, and to this we commit going forward, will be open - forever. Founded by engineers who built the backbone of the European Energy Revolution, infinimesh aims to make industrial and individual IoT secure, available and affordable for all. Infinimesh runs in all cloud offerings, be it public, hybrid or private. All you need is Linux; our platform works in any container environment as well as natively.

Infinimesh on Google Cloud

We have selected Google Cloud as strategic partner for our SaaS offering - and from today on the platform is running on GCP! Our SaaS offering, running in Google Cloud, is free for everybody up to 25 devices - ideal for makers, startups and industrial Proofs of Concept. That leaves enough room to bring ideas to life, test even larger installations and use the feature-rich ecosystem of GCP to make your idea a successful product.

What can I do with infinimesh IoT on GCP right now?

  • Connect devices securely via MQTT 3.1.1
  • Transfer desired and reported device states
  • Manage accounts (Create/Delete)
  • Manage Namespaces to organize devices and restrict access to devices
  • Create hierarchically organized objects, e.g. buildings, rooms to organize and model device hierarchies
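
As a rough illustration of the desired/reported state idea from the list above, here is a sketch in Python. This is not the infinimesh API or wire format; the function name and the state keys are assumptions. It computes what a device still has to change to reach its desired state.

```python
_MISSING = object()  # marks keys the device has not reported at all


def shadow_delta(desired, reported):
    """Return the desired settings the device has not (yet) reported.

    `desired` is what the backend wants, `reported` is what the device
    last told us. Both are plain dicts in this sketch.
    """
    return {
        key: value
        for key, value in desired.items()
        if reported.get(key, _MISSING) != value
    }


# Hypothetical device state.
desired = {"light": "on", "target_temp": 21}
reported = {"light": "off", "target_temp": 21, "firmware": "1.0"}

delta = shadow_delta(desired, reported)
print(delta)
```

Once the device applies the delta and reports back, both states match and the delta becomes empty.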

How does it work?

Our Kubernetes Operator does the work a real operator would do: it not only installs the whole platform, but also takes care of required cloud/datacenter resources, updates, monitoring and handles incidents like errors. It attempts to resolve as many issues as possible on its own, and notifies human operators when human intervention is required. The operator is the glue between infinimesh and the target installation environment. Our alpha drop focuses on Google Cloud Platform and enables exactly this environment. More supported environments will follow.

We build features for industrial IoT


Device Management

Powerful but simple framework to visualize clusters of devices within your organization and set permissions up to device level.

Device Shadow

Real-time and two-way correspondence for every device in your fleet. Our highly-scalable backend can power millions of devices.

Timeseries Visualization

Great telemetry is based on timeseries. infinimesh has timeseries data capabilities built-in and enables meaningful monitoring.

Virtual Twins

A virtual twin is the digital copy of your physical asset. infinimesh provides virtual twins which give you the possibility to modify your physical device without even touching it.

Intelligence

infinimesh has Machine Learning and Artificial Intelligence models built-in to rapidly detect anomalies and respond accordingly.

Roadmap and features ahead


OPC-UA with full open62541 support (binary protocol with encryption) and BACnet will be available within the next quarter.

OPC-UA is the leading semantic protocol for Industry 4.0 and opens up the full potential of industry-proven stacks like Siemens MindSphere and IBM Watson for Industry. BACnet will also make its way into the platform quite soon; we expect a first drop in the next couple of weeks. BACnet is the most widely used communications protocol for Building Automation and Control (BAC) networks; it is an ASHRAE, ANSI and ISO 16484-5 standard and serves as the protocol stack in many intelligent buildings.

What’s next?

More exciting news and announcements will follow in the coming months, so use the platform and follow this blog or our channels to never miss any news. We are happy to have you as a user and customer, and we will support you with any idea you have. Drop us a mail, open a Feature Request (https://github.com/infinimesh/infinimesh/blob/master/.github/ISSUE_TEMPLATE/feature_request.md) or contact (https://infinimesh.io/contact.html#contact) us over our different channels - we are here.

Wednesday, November 22, 2017

Next Internet comes with IoT

The Internet we know is a great space for collaboration, social media and gaming. But when it comes to business or transactions, the power belongs to a few big players. Remember the S3 outage, when half of the North American services were offline? Or the Dyn attack, which knocked half of the internet offline for hours? The next internet could be an independent, blockchain-based network that uses as many protocols as are available, runs on top of the Internet and has no single party in control of it.

In a nutshell, blockchain is a decentralized system in which every transaction gets mathematically verified by the members of the system, so every member involved in that transaction knows about it. The transaction information is stored on the distributed servers of the blockchain. That makes manipulation extremely difficult, and the transaction is also highly available at all times.
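
Why manipulation is so hard to hide can be shown with a small hash chain in Python. This is a deliberately reduced sketch: it leaves out distribution and consensus between members, and the block layout and function names are assumptions for illustration. Each block stores the hash of its predecessor, so changing an old transaction breaks every check that follows.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder predecessor for the first block


def block_hash(index, transaction, previous_hash):
    """Hash the block content together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "transaction": transaction, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, transaction):
    """Append a new block that is linked to the current chain tip."""
    previous_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    block = {
        "index": index,
        "transaction": transaction,
        "previous_hash": previous_hash,
        "hash": block_hash(index, transaction, previous_hash),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Recompute every hash and check every link to the predecessor."""
    expected_previous = GENESIS_HASH
    for index, block in enumerate(chain):
        if block["index"] != index or block["previous_hash"] != expected_previous:
            return False
        recomputed = block_hash(block["index"], block["transaction"], block["previous_hash"])
        if block["hash"] != recomputed:
            return False
        expected_previous = block["hash"]
    return True


chain = []
append_block(chain, {"from": "alice", "to": "bob", "amount": 5})
append_block(chain, {"from": "bob", "to": "carol", "amount": 2})
print("valid before tampering:", is_valid(chain))

chain[0]["transaction"]["amount"] = 500  # manipulate an old transaction
print("valid after tampering:", is_valid(chain))
```

In a real blockchain every member holds a copy of the chain and runs this kind of check, which is what makes a manipulated copy stand out.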

IoT devices are getting more and more intelligent and can now create meshed networks by themselves, switching from sensor to actuator and sharing information only with their neighbors. For example, a device could tell the doorknob that the homeowner will be home in 5 minutes with his EV, so the wallbox and the door need to be unlocked. Right now that is possible with IFTTT, which is an extra service and needs manual configuration; in the future this will happen automatically over direct meshed information cells, including status updates.

When we look at the power of billions of IoT devices, be it sensors, cameras, windmills, cars or whatever, they all carry CPU and memory as a basis. Connecting all of them creates a large, highly available interconnected system: always on, always accessible, always responsive, self-connected things which share information about their environment with other things on their own and trigger automated actions, learned from the behavior of the things’ environment. Thought of as an ultra-wide blockchain, those devices will be the next internet. Transactions, information and data will be stored securely on a device, and every device connected to another device will automatically become a member of the global blockchain pool in the future. That brings the power of blockchain to an always-connected network, speeding up the digital disruption every business faces and allowing enterprises to build models based on the decentralized network. Right now, without an economic virtual entity to establish each other's identity, over 2 billion people are excluded from taking part in any financial transaction globally, while others can collect data about us, steal identities and commit fraud without leaving us a chance to fight back. Those who hold the power and control large parts of the Net can’t be disempowered, because they operate large parts of the Net, too.

That mistake can and will be corrected by the next Internet, which brings radical new solutions to the Internet we know. Most of them are based on blockchain technology, such as what Ethereum provides for smart contracts.

Another technology move could be blockchain-powered AI: immutability, shared decentralized control and trusted audit trails lead to qualitatively better data and, through more available data, better algorithms. Real-world modeling works on large volumes of data, such as training on large datasets or high-throughput stream processing. Applying blockchain to AI therefore requires blockchain technology with big-data scalability and querying, like the groundbreaking BigchainDB with the public IPDB. And a globally scaled blockchain unlocks new large-scale opportunities, from better model training through model sharing, over a shared global AI model registry, to automated wealth for our planet.

Friday, June 16, 2017

The Machine and BigData

HP’s “The Machine” (1) project is in my eyes the most advanced in the IT world, with the simple goal of rethinking the entire computer design. And the plan is ambitious: the first edge devices shall be ready in 2018, an industrialized series in 2020.

Will “The Machine” really revolutionize an entire industry mostly influenced by IBM? Let’s say it could, and it probably will, with a high probability of success.
Based on the idea of the memristor (2), the project uses memory-based technology to store data. Nothing new here. What is new is the non-volatile usage: data stored in a memristor persists until the storing bit is cleared and realigned. NVRRAM (non-volatile resistive RAM) is faster than volatile DDR4 modules (which the project uses at the moment, until Western Digital can deliver NVRRAM modules) and about 100 times faster than current state-of-the-art SSD-based technologies. The newest prototype has 40 nodes with approx. 160 TB of DDR4 RAM and 1,280 cores, connected with X1 PMs (Photonic Modules). In short: pretty fast. For more interesting engineering facts, follow reference (1).
The most important consideration is the purely permanent, fully integrated storage itself. The attached storage layer (like HDFS, GFS, Ceph) would simply disappear and merge directly with the computation layer. The principle “local data first” will surely be part of any fine-tuning approach, but with the high density of storage it will not really matter. All pieces of computation will be in the same place (cache, volatile and permanent storage combined with fast caching) and work as one homogeneous entity which can hold every state of every piece of data during the whole computation lifecycle.
I just want to consider how this idea changes the fundamentals and what that would mean for data processing. The first big difference: a trinity memristor can store 10 bits instead of 8 today. That simply means a 3 times higher data storage density than today. Additionally, the highly volatile cache a CPU uses during calculation will be stored permanently, which allows subsequent processes to reuse the pre-calculated subsets and would speed up any calculation dramatically. For example, pattern detection algorithms like MCMC (3) could benefit greatly by simply picking up an already calculated subset and using it in a new chain, which would revolutionize data intelligence in terms of speed and tree generation. I think that’s a huge step into the AI world: ultrafast learning algorithms helping mankind operate highly sensitive environments like deep-space flights, connected cars, CEP networks or decentralized power grids.

(1) https://www.labs.hpe.com/the-machine
(2) http://en.wikipedia.org/wiki/Memristor
(3) https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo

Tuesday, May 9, 2017

The next stage of BigData

Right now, the terms BigData and Hadoop are used as one and the same, often as the buzzword of buzzwords. And they mostly sound like a last call, often made by agencies to convince people to start the Hadoop journey before the train leaves the station. Don’t fall into that trap.

Hadoop was made by people who worked in the early internet industry, namely at Yahoo. They crawled millions upon millions of web pages every day, but had no system to really benefit from this information. Doug Cutting created Hadoop, a Map/Reduce framework written in Java and based on a blueprint published by Google in 2004 (1). The main purpose was to work effectively with an ultra-large set of data and group it by topic (to put it simply).
Hadoop is now 10 years old. And in these 10 years the field of data management, wrangling and analysis has moved faster and faster. New approaches, tools and techniques emerge every day in the brain-centered areas called Something-Valley. All of them target the way we work and think with data.
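
The Map/Reduce model itself is small enough to sketch in plain Python. This is a single-process illustration with hypothetical function names and sample pages. It is not Hadoop code, which distributes the same phases across many machines. The example counts words across documents.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for document in documents for pair in map_phase(document))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical crawled pages.
pages = ["big data big plans", "data at rest"]
counts = word_count(pages)
print(counts)
```

Because each map call only sees one document and each reduce call only sees one key, both phases can run in parallel on different machines. That is what made the model attractive for crawling at web scale.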

That describes the main problem of Hadoop itself: it is designed as a self-contained system, providing storage and computation layers at once. That is why Hadoop distributions typically suggest bare-metal installations in a data center and push companies to create the next siloed world, promising a happy ending after leaving the previous one (separate DWHs with no connection between each other). That comes with dramatic costs, heavy operations and a workforce of highly trained engineers, along with the high cost of connecting on-premise systems to the new siloed DataLake approach, often mixed up with lift-and-shift operations. Here the next big problem arises, described as “data gravity”: data simply sinks to the bottom of the lake until nobody can even remember what kind of data it was and how it could be analyzed. And here the Hadoop journey mostly ends. A third issue, driven by agencies convincing companies to invest in Hadoop and hardware, is the talent war. In the end it simply creates the next closed world, just with a fancier name.

The world keeps turning, right now toward the public cloud, but with device edge computing (IoT) and DCC (DataCenter on a chip) as the target. Additionally, the kind of data changes dramatically, from large chunks of data (petabytes of stored files from archives, crawlers and logfiles) to streamed data delivered by millions upon millions of edge computing devices. Just dumping data in a lake with no vision beyond getting cheap storage doesn’t help to solve the problems companies face on their digital journey.

As the nature of data changes, the need for data analysis changes with the way data is created and ingested. The first analysis will be done at the edge, the second during the ingestion stream and the next one(s) when the data comes to rest. The DataLake is the central core and will be the final endpoint to store data, but the data needs to be categorized and catalogued during stream analytics and stored with a schema and data description. The key point in a so-called Zeta-Architecture is the independence of each tool, the “slice it down” approach. The foundation is the data-centered business around a data lake, but the choice of tools for getting data into the lake, analyzing it and visualizing it isn’t set in stone and is independent of the central core.
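
What “catalogued and stored with a schema and data description” can mean in practice is sketched below in Python. The `CatalogEntry` class, the field names and the sample records are assumptions for illustration and not part of any particular data lake product. Records are checked against the catalogue during ingestion, so nothing reaches the lake without a known schema.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    """Describes one kind of data before it is allowed into the lake."""
    name: str
    description: str
    schema: dict  # field name -> expected Python type

    def matches(self, record):
        """True if the record has exactly the catalogued fields and types."""
        if set(record) != set(self.schema):
            return False
        return all(isinstance(record[key], kind) for key, kind in self.schema.items())


def ingest(records, entry):
    """Split a stream of records into catalogued and rejected ones."""
    lake, rejected = [], []
    for record in records:
        if entry.matches(record):
            # Store the record together with the name of its catalogue entry.
            lake.append({"catalog": entry.name, "data": record})
        else:
            rejected.append(record)
    return lake, rejected


# Hypothetical catalogue entry for temperature telemetry.
temperature = CatalogEntry(
    name="temperature_v1",
    description="Temperature readings in degrees Celsius per device",
    schema={"device": str, "celsius": float},
)

records = [
    {"device": "sensor-7", "celsius": 21.5},
    {"device": "sensor-8", "temp": "warm"},  # unknown shape, gets rejected
]
lake, rejected = ingest(records, temperature)
print(len(lake), "stored,", len(rejected), "rejected")
```

Keeping the catalogue check separate from the storage step also fits the “slice it down” idea: the tool that validates records can be swapped without touching the lake itself.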

That makes it possible to really take advantage of any kind of data, to open new revenue and sales streams and to finally see all data-driven activity not as a cost-saving project (as most agencies and vendors promise) but as a revenue-creation project. Using modern cloud technologies moves organizations into the data-centric world, focusing on business and not operations.

(1) https://research.google.com/archive/mapreduce.html