Saturday, December 08, 2012

Solution is a state not a path

Networking is going through an identity crisis. Is it a special device or just a program on a commodity machine? The crux of the problem lies in a shift in the problem space where networkers spent all their time. The solution to a networking problem today is not the best path but the best state. In the early years networkers took the path branch because, in those days, finding the best path in a graph was the solution to getting the applications (printing at first, later email) to work. But today, a simple Struts application has more action elements (routes) than a BGP table on the internet. In other words, there are more paths a browser can take after it has landed on the web server than it had to take getting to the web server.

When state management and control is the problem, the solution is software. The paths are discovered and known; what matters now is learning which ones to use for which application. This is what is causing the identity crisis for networkers, which then manifests itself as SDN, Programmable Networks etc. It is also the reason why the largest networking company flip-flops between "we are going to focus on networking" and "we want to be an IT company". Hope it does not do what Freeport did - a copper company acquiring an O&G company.

Tuesday, December 04, 2012

State Description Language

Almost a decade after WSDL - the service interface description language - was introduced, I think it is time for the industry to create a state description language. Just like WSDL enabled communication of the interface between two points engaged in a conversation (or any other MEP), we need two infrastructure components inside a data center to communicate their state to one another. The end goal is an autonomic data center where a component "learns" about its fellow components and reflects on and adjusts its own state.
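To make the idea a little more concrete, here is a minimal sketch of what exchanging state could look like. Everything in it is an assumption made for illustration: the `ComponentState` fields, the JSON encoding and the `weight_for_peer` rule are hypothetical and are not part of WSDL or any existing standard.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ComponentState:
    # Hypothetical fields; a real state description language would define its own.
    component: str
    kind: str
    load_percent: float
    accepting_work: bool


def describe(state: ComponentState) -> str:
    """Serialize a component's state so that a peer can read it."""
    return json.dumps(asdict(state), sort_keys=True)


def weight_for_peer(peer_description: str) -> int:
    """Adjust our own behavior from a peer's description.

    Returns how much work (0-100) we are willing to send to that peer.
    """
    peer = ComponentState(**json.loads(peer_description))
    if not peer.accepting_work:
        return 0
    # Send work in proportion to the headroom the peer reports.
    return max(0, int(100 - peer.load_percent))


storage = ComponentState("storage-01", "storage", 35.0, True)
message = describe(storage)
weight = weight_for_peer(message)
```

The point of the sketch is only that the message carries state ("I am at 35% load and accepting work") rather than an interface, and that the receiver changes its own behavior in response.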

Application integration is easy today. Everybody has an API over which default (useless) information can be communicated. But to use this newfound connection we need to develop languages that enable communication about data, state, intention etc. The whole movement to the cloud has commoditized compute and storage to a point where $5/month can buy me more IT infrastructure than my university had when I was studying engineering...

Thursday, September 20, 2012

Biggest Bottleneck in Bigdata is the human

Today's WSJ carries an article on Bigdata being the brain behind hiring in companies. There are lots of Bigdata articles all around and each one points to a new bottleneck for the industry to overcome. There is one bottleneck that no one discusses. It is the ultimate consumer of Bigdata - the human. If we have trouble getting computers to deal with Bigdata, imagine presenting the analysis to a human. We are simply not wired to consume all this analysis. That is where visualization steps in.

So what is visualization of Bigdata? It is the rendering of insight in a data analysis using images, animations, selective disclosures, progressive disclosures or charts/figures/clouds/bubbles etc. The challenge here is not in rendering these visual elements, but in mapping these elements to the data that is sourced across the internet, parsed by multiple parsers, collated/curated and correlated with multiple streams and displayed on a canvas that is hosted in yet another place. Then there is the debate of HTML5 vs native too.

Visualization of Bigdata is a field actively targeted by the research community, and there is a strong business case for it as well: it solves the biggest bottleneck in the Bigdata ecosystem, i.e. the human.

Monday, March 26, 2012

IM is the Best Overlay Network

Under the moniker of Software Defined Network, or SDN, a flurry of new technical proposals are vying to become the next standards. The business case for them is very sound, i.e., service providers, after failing in their first attempt in the late 90s and early 2000s, are now confident that there is a way to monetize services which run inside datacenters. The problem, of course, is that the scale at which they need to operate requires that services run across the traditional boundaries set up by networking. To cross these boundaries they are devising clever ways of getting bits across a policy boundary using overlays (envelopes inside other envelopes). The debate lately is over which layer's envelope to use: Layer-2 or Layer-3. OK, so at the end of the day what is needed is a mechanism that abstracts the underlying network and presents it to the applications over sockets in a programmer-friendly fashion.
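The "envelope inside another envelope" idea can be shown in a few lines. This is a toy sketch only: the header layout (a 2-byte length followed by a JSON header) and the names `encapsulate` and `decapsulate` are invented for illustration and do not correspond to any real overlay format.

```python
import json


def encapsulate(inner_frame: bytes, tenant_id: int, outer_src: str, outer_dst: str) -> bytes:
    """Wrap a tenant's frame in an outer envelope addressed between two tunnel endpoints."""
    header = json.dumps({
        "src": outer_src,
        "dst": outer_dst,
        "tenant": tenant_id,
        "len": len(inner_frame),
    }).encode("utf-8")
    # 2-byte big-endian header length, then the header, then the untouched inner frame.
    return len(header).to_bytes(2, "big") + header + inner_frame


def decapsulate(packet: bytes):
    """Strip the outer envelope and return (outer header, inner frame)."""
    header_len = int.from_bytes(packet[:2], "big")
    header = json.loads(packet[2:2 + header_len].decode("utf-8"))
    start = 2 + header_len
    return header, packet[start:start + header["len"]]


frame = b"original layer-2 frame from a VM"
packet = encapsulate(frame, tenant_id=42, outer_src="10.0.0.1", outer_dst="10.1.0.1")
header, recovered = decapsulate(packet)
```

The underlying network only ever looks at the outer envelope; the inner frame crosses the boundary unchanged, which is all an overlay has to guarantee.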

What the networking guys need to do is look at how this problem was solved in the application world over a decade ago. Hint: use HTTP for the control plane, XML for the management plane and whatever you want for the data plane. The best overlay network in the world continues to be IM (instant messaging).

Monday, February 27, 2012

SDN vs SdN

Maybe they should have defined software driven network as application driven network, so as not to confuse it with software defined network (OpenFlow and the rest...). The former is all about network applications using the open APIs on the network to provision, reserve and optimize the path that the application data takes. The open API could be network as a service. The ultimate goal of this initiative is a programmable network.

The latter is a rearchitecture of the switch with a focus on virtualization of the hardware forwarding plane: a multi-device network operating system and a bunch of controllers that use the NOS to configure the devices. The ultimate goal of this initiative is operational excellence.

Thursday, February 23, 2012

Innovation is its own Business Case

"What is the business case for this innovation?" Heard that?

Innovation or lack of it is the reason for 2.x% GDP growth in the developed world and we still hear this call for justification. The Fed funds rate is 0-0.25% and yet we have to justify the NPV of a project. Everybody from the company president to the country president is asking (no, begging) for innovation, but we need to prove ROI for a project. The Chinese did not need a business case to build so-called "Ghost cities". They consume 40% of copper production in the world without a business case. They are the equivalent of Apple in the world today, with surplus cash to buy out several Greek deficits. They even placed their machine on the Top 500.

Innovation is its own business case. You cannot calculate the ROI of innovation because you cannot measure the return. So think of the "I" in ROI as your ability to finance (means) and risk tolerance.

Tuesday, February 07, 2012

SDN

When J Hamilton said the network gets in the way, I believe he was asking for an economical solution, kind of like the way x86 based servers replaced RISC processors. The x86 world did not bring complexity, an unknown paradigm with unknown cost and difficult-to-estimate performance. Software Defined Networking, a.k.a. SDN, is apparently doing just that. What started as a simple protocol to read/write/replace entries in a switch's forwarding table has now attracted code bloat and frameworks which promise a network that does substantially what is already done with VMware's vSwitch.
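For a sense of how small the original idea was, the three operations fit in a toy model like the one below. It is a sketch under stated assumptions, not OpenFlow or any real switch API: the `ForwardingTable` class and its `write`, `read` and `replace` methods are made-up names, and the table simply maps a destination MAC address to an output port.

```python
class ForwardingTable:
    """Toy forwarding table: destination MAC address -> output port."""

    def __init__(self):
        self._entries = {}

    def write(self, dst_mac, out_port):
        # Add a new entry; refuse to silently overwrite an existing one.
        if dst_mac in self._entries:
            raise KeyError(f"entry for {dst_mac} already exists")
        self._entries[dst_mac] = out_port

    def read(self, dst_mac):
        # Return the output port, or None when there is no entry.
        return self._entries.get(dst_mac)

    def replace(self, dst_mac, out_port):
        # Overwrite an existing entry and return the port it used to point at.
        if dst_mac not in self._entries:
            raise KeyError(f"no entry for {dst_mac}")
        old_port = self._entries[dst_mac]
        self._entries[dst_mac] = out_port
        return old_port


table = ForwardingTable()
table.write("00:11:22:33:44:55", 1)
old_port = table.replace("00:11:22:33:44:55", 7)
```

Anything beyond this (controllers, frameworks, network operating systems) is what has been layered on top of that simple starting point.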

The original call to simplify the network listed a few use cases as drivers, which can be categorized as 1) workload placement (includes all shapes/forms of VM mobility, code mobility etc.) and 2) big data processing. The first challenge attracted solutions such as overlays and tunnels which can extend a VM's notion of its layer-2 domain within and across datacenters. What we need now is a standardized way of doing this, not a new framework which disassembles a network device's data, control and management planes and places one or more of those on a general purpose CPU. We had already done some of this in the olden days with InfiniBand's subnet manager and came to the conclusion that it does not scale for dynamic workloads. The second challenge is still in the process of becoming a challenge. With all the hype behind bigdata, an average cluster running bigdata analysis is under 40 nodes. Even Hadoop does not scale beyond 4K nodes.

What the datacenter really wants is a cheap network switch that costs no more than $10/GByte of traffic in motion. $10/GB is the same metric used to gauge the efficacy of a storage service. It should not cost more than $10 for a gigabyte of data in store or in motion (on the wire) and, in the future, in memory on the compute side.
