The Data Dash #2: AI New Frontiers, Zucky and bananas

Happy New Year! I’m truly humbled by your responses to my first newsletter last month. The number of new subscribers has been incredible, and I deeply appreciate your support. What started as an experiment in 2024 has turned into something I genuinely enjoy. So, as we step into 2025, I’m making this my new routine. I hope you’ll continue to join me each month in exploring what’s new in tech, and that you’ll find as much value in reading these newsletters as I do in writing them.

(–) En dashes

Let’s start with an absolute mic-drop moment from DeepSeek: before the end of 2024, they released their new LLM, DeepSeek-V3, and open-sourced the weights of this frontier-grade model, which was trained on 2,048 GPUs over two months at an estimated cost of approximately $6 million. This level of capability is typically thought to require clusters closer to 16,000 GPUs, and current cutting-edge clusters are scaling up to around 100,000 GPUs. For context, Llama 3 (405B) consumed 30.8 million GPU-hours during training. In contrast, DeepSeek-V3 appears to be a far more efficient model, requiring only 2.8 million GPU-hours—approximately 11 times less compute. While 2024 was marked as the year of more efficient AI inferencing, 2025 is shaping up to be the year of more efficient generative AI (GenAI).

The data protection market saw some significant acquisitions and IPOs in 2024. Notably, Cohesity and Veritas completed their merger and subsequently spun off Arctera. The new entity, Arctera, retains key product lines formerly under Veritas, including InfoScale, Backup Exec, and the Data Compliance and Governance group—products that primarily target SMBs and overlap with Cohesity’s offerings. This strategic move positions Cohesity among the top three products by market share.

Public cloud providers are making massive investments in AI infrastructure to drive client demand.
However, David Linthicum argues that this could have the opposite effect. He predicts that micro-cloud providers offering niche services will emerge as the real winners in this space.

Meanwhile, OpenAI is seeking additional funding and has started signaling a shift from a non-profit to a for-profit structure. Simultaneously, it is expanding its presence in Washington to secure greater political influence and energy resources for its data centers. The energy demands of AI are no secret, and the situation has escalated quickly: new U.S. regulations now target AI models trained with more than 10^26 operations, effectively covering models with 70 billion parameters or more. Interestingly, open-weight models are exempt from these restrictions. GPUs with a total processing performance (TPP) exceeding 4,800 - such as Nvidia’s H100 and A100 - are also included under the new rules. Export restrictions now extend to additional countries like Singapore and some EU nations, including the Czech Republic.

It’s ironic but true: CrowdStrike exemplifies how cybersecurity companies can thrive even after they fail. Despite a major outage in 2024 caused by a faulty software update that disrupted millions of systems globally, CrowdStrike continues to grow.

After Elon Musk’s antics, Mark Zuckerberg has gone bananas as well. However, this has spurred some interesting progress in the social network space. Mastodon is transitioning into a non-profit organization to reinforce its decentralized ethos. Meanwhile, Free Our Feeds - a new initiative - aims to “liberate social media” using the AT Protocol developed by Bluesky, originally founded by Twitter’s Jack Dorsey.

(—) Em dashes

If you want to go on a journey and think a little deeper about AI, read Machines of Loving Grace, a long piece by Dario Amodei (Anthropic’s CEO). It is full of interesting points and predictions on AI.
The team at Applied LLMs shares a practical guide to navigating the challenges of building with LLMs, covering everything from crafting killer prompts to avoiding the dreaded “Frankenstein prompt” and embracing retrieval-augmented generation (RAG) over fine-tuning for better results. Their advice boils down to this: keep it simple, structured, and scalable. Technical but very insightful.

Private Cloud is a Trap. Or not?

I watched a video from Keith explaining why he thinks “Private Cloud is a Trap” and why typical enterprises shouldn’t build a private cloud but should run hybrid infrastructure instead. I thought I would dive deeper into this and explain why I think enterprises should embrace private cloud.

Keith’s point is that rapid elasticity or infinite capacity is impossible in private cloud environments, and for that reason you should use the public cloud. In my opinion, rapid elasticity or infinite capacity doesn’t exist in general - not even in the public cloud. It has only been abstracted away from you as the user, so you think there is infinite capacity, but hit a quota and you’ll realize that “cloud is just someone else’s computer.” If you are a user (e.g., a software developer), it is not your responsibility to worry about capacity or resources - that should be the problem of your platform team. I don’t believe your software developers should have access to infinite resources anyway - there is a good reason why everyone should implement guardrails into their self-service, whether for public or private cloud, so you will not be crying on Reddit about how much AWS charged you after accidentally leaving instances running.

Achieving rapid scalability in a private cloud isn’t necessarily impossible, provided resources are thoroughly planned and monitored. Observability is crucial - not only for monitoring resources and user experience but also for predicting future resource requirements.
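The forecasting idea above can be sketched in a few lines of Python. This is a minimal illustration, not a real capacity-planning tool: it fits a least-squares linear trend to historical usage and projects it forward, and the resource names and numbers are hypothetical. Real forecasting would also need to account for seasonality and bursts.

```python
def forecast_usage(history, periods_ahead):
    """Project future usage with a simple least-squares linear trend."""
    n = len(history)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history)) / \
            sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    # Extrapolate the trend line `periods_ahead` steps past the last sample.
    return intercept + slope * (n - 1 + periods_ahead)

# Hypothetical monthly vCPU consumption for the last six months.
history = [400, 420, 445, 470, 490, 515]
print(round(forecast_usage(history, 3)))  # projected demand three months out
```

Even a naive projection like this is only possible if observability data is collected consistently - which is exactly where many enterprises fall short.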
If your observability stack isn’t implemented properly - as is the case for many enterprises - you cannot accurately forecast demand or provide sufficient resources where and when they are needed. In private cloud environments, predictable economics are key.

Rapid scalability is the cloud’s key differentiator from on-prem datacenters and what made the public cloud so successful. To replicate public cloud-like elasticity within your own datacenter, adopt a private cloud model, which provides on-demand resource provisioning, rapid scaling, and robust cost management. I agree with Keith that building a private cloud is not easy for a typical enterprise, but I think it’s worth the pain.

Scalability, billing, and self-service are key to the successful adoption of a private cloud - or any cloud. It is important to ask why scalability and billing often work in the public cloud but can be more challenging in private cloud environments. Building a billing system in your private cloud involves multiple components: self-service, a high level of automation, and integration across diverse APIs and technologies, such as Cisco, Pure Storage, VMware, and Kubernetes. If you really want to simplify things, you could opt for a rack-scale system from a vendor like Oxide, which offers a single API and technology stack. This setup can greatly simplify your automation efforts by standardizing the infrastructure layer.

Once automated, you can apply resource tagging in your private cloud and feed those tags into reporting for usage and cost. This process must also be automated where possible. It is not a trivial task and often causes headaches for enterprises, because it’s not purely a technical challenge - it also involves reworking processes and retraining people. That’s why it can be so tricky. It is not just a private cloud problem - it is a skills and people problem.
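To make the tagging-to-reporting flow above concrete, here is a minimal Python sketch. The tag names, resource types, and unit rates are all hypothetical assumptions; a real implementation would pull usage records from your infrastructure APIs and rates from your cost model rather than hard-coding them.

```python
from collections import defaultdict

# Hypothetical per-hour unit rates per resource type.
RATES = {"vcpu": 0.02, "memory_gb": 0.005, "storage_gb": 0.0001}

def cost_report(usage_records):
    """Aggregate tagged usage records into a per-team cost summary."""
    totals = defaultdict(float)
    for rec in usage_records:
        # Untagged resources are grouped under "untagged" so they show up
        # in the report instead of silently disappearing.
        team = rec["tags"].get("team", "untagged")
        totals[team] += RATES[rec["resource"]] * rec["amount"] * rec["hours"]
    return dict(totals)

# Example records as they might be exported from a tagged inventory.
records = [
    {"tags": {"team": "payments"}, "resource": "vcpu", "amount": 8, "hours": 720},
    {"tags": {"team": "payments"}, "resource": "memory_gb", "amount": 32, "hours": 720},
    {"tags": {}, "resource": "storage_gb", "amount": 500, "hours": 720},
]
print(cost_report(records))
```

The hard part is not the aggregation - it is getting tags applied consistently in the first place, which is the process-and-people problem described above.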
With all of the above in place, you have hopefully created an incentive for developers and application owners to move workloads on-prem. If you try to force your developers, you will fail. I have seen many times that restricting developers’ access to the public cloud creates an environment where “grey IT” flourishes. You need to incentivise your developers to deploy their applications outside the public cloud, and that will depend greatly on how running applications outside the public cloud compares in terms of deployment time (i.e., how quickly developers can deploy workloads) and cost per resource (i.e., the expense of running workloads). If your software developers have to request resources like CPUs, memory, and storage instead of using self-service, you are wasting their time - and time is money. You are not creating an incentive for application owners to run their workloads outside the public cloud; they will stay in the public cloud.

Deploying a standard stack of servers, storage, and VMware near a public cloud provider (e.g., in an Equinix data center) is not enough to stay competitive. Relying solely on hybrid infrastructure is too narrow and limiting for what developers and application owners require. Today’s IT landscape demands more than just hosting infrastructure - it calls for business agility, scalability, and self-service. For enterprises that need to evolve beyond basic infrastructure provisioning, a private cloud approach provides the right framework. Building a private cloud may be difficult, but tackling tough challenges is how we drive innovation forward.
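To close with the self-service guardrails discussed earlier: the core idea is simply to validate every provisioning request against a quota before anything is created. This Python sketch is illustrative only - the quota values, resource names, and function shape are assumptions, not any real platform’s API.

```python
# Per-team quota limits; illustrative values, not a real platform's defaults.
QUOTAS = {"vcpu": 64, "memory_gb": 256}

class QuotaExceeded(Exception):
    """Raised when a request would push a team past its quota."""

def request_resources(current_usage, request):
    """Validate a self-service provisioning request against the quotas.

    Returns the team's new usage on success; raises QuotaExceeded otherwise.
    """
    for resource, amount in request.items():
        used = current_usage.get(resource, 0)
        limit = QUOTAS.get(resource)
        if limit is not None and used + amount > limit:
            raise QuotaExceeded(
                f"{resource}: requested {amount}, already using {used}, limit {limit}"
            )
    # In a real platform the actual provisioning (VMware, Kubernetes, storage
    # APIs) would happen here; the sketch only returns the updated usage.
    return {r: current_usage.get(r, 0) + a for r, a in request.items()}

# A request within quota succeeds...
print(request_resources({"vcpu": 16, "memory_gb": 64}, {"vcpu": 8, "memory_gb": 32}))
# ...while one that would exceed the vCPU limit is rejected before any
# provisioning happens.
try:
    request_resources({"vcpu": 60}, {"vcpu": 8})
except QuotaExceeded as exc:
    print("rejected:", exc)
```

The same check works whether the backend is a public cloud account or your own datacenter - which is exactly why guardrails belong in the self-service layer, not in a per-cloud afterthought.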