Sunday, March 13, 2016

Machine Learning, Deep Learning and Streaming Data Processors

When AlphaGo beat a top human player last week, using ML to evaluate only the small set of board positions its computing power could reach, it showed that deep learning (learning driven by algorithms rather than by sheer volumes of data) has arrived. But can the same machine analyze a stream of seemingly unrelated mouse clicks and identify a hack in progress? Could it have flagged the fraudulent transfers sent through the NY Fed and saved Bangladesh roughly $100M in lost funds?

As I refresh my understanding of ML, this time with Spark MLlib, I am starting to think that this use case of streaming data analysis with ML or deep learning is the NBT (Next Big Thing). Running this type of computational job requires a cloud, because no small cluster will do, and a specialized network fabric, because nothing enforces policy better than the network itself. Next-generation compute systems, and especially new memory/microprocessor architectures, have found their killer app in streaming data processing, just as GPUs found theirs in video games. This time the games are played by hackers, and the losses are measured in hundreds of millions.
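To make the idea concrete, here is a minimal sketch of the kind of streaming analysis I have in mind, using Spark MLlib's StreamingKMeans to cluster click-event features as they arrive; points far from every cluster centre become candidate anomalies. The host, port, and three-feature layout below are hypothetical placeholders, not a real feed.

```python
# Sketch only: unsupervised clustering of click-event feature vectors with
# Spark MLlib's StreamingKMeans. Source and feature layout are assumptions.
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.mllib.clustering import StreamingKMeans
from pyspark.mllib.linalg import Vectors

sc = SparkContext(appName="ClickstreamAnomalies")
ssc = StreamingContext(sc, 10)  # 10-second micro-batches

# Assume each line is a comma-separated numeric feature vector derived from a
# click event (e.g., inter-click interval, payload size, endpoint id).
clicks = ssc.socketTextStream("localhost", 9999) \
            .map(lambda line: Vectors.dense([float(x) for x in line.split(",")]))

# Maintain k cluster centres over the stream; a decay factor < 1 lets old
# behaviour fade so the model tracks the current traffic pattern.
model = StreamingKMeans(k=5, decayFactor=0.5).setRandomCenters(dim=3, weight=1.0, seed=42)
model.trainOn(clicks)

# Emit the cluster assigned to each incoming point; a downstream job would
# score distance-to-centre and alert on outliers.
model.predictOn(clicks).pprint()

ssc.start()
ssc.awaitTermination()
```

The API itself is simple; the open question the paragraph above raises is whether this keeps up at cloud scale and on the network fabric where the policy actually has to be enforced.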

Sunday, March 06, 2016

Microservices and Containers - Not Perfect Together


A turf battle has been going on since the 1990s between the developer and the deployer (a.k.a. the admin). Currently the admin controls the security, scale, and size of the application, while the developer controls the content, the architecture, and the interfaces to other systems. Two emerging technologies shift power between these two constituencies: microservices empower the developer, while containers empower the deployer/admin.

If either technology wants to take off, it needs to find other partners, not each other. For microservices to take off, they need to get the container monkey off their back. Containers shift responsibility for security and scale onto the developer and away from the deployer. The developer should ideally be limited to interface design and to selecting abstractions that make the logic easy to codify. Putting security and scale in the developer's hands goes against the grain of application evolution, where binding is done as late as possible and certainly not at design time. A datacenter administrator will be very uneasy deploying applications in which the developer has hard-coded the security policy and the scale limits.

Containers have a similar story. The container is not the disruptive technology it is portrayed to be, nor is its effect as tectonic as virtualization or Java/JVM. It needs to find a partner in a deployment technology such as Vagrant, Rails, or a PaaS. Currently its main value proposition is the time it takes to bring up an execution environment, but to get that, one has to give up security, naming, directory, and identity. In other words, it pushes those decisions to the developer, who is the most ill-equipped to handle them.
