Big SQL Analytics: What Is It and Why Should You Care?
Today, analytics has become mainstream in most organizations, and the majority of analytics and data warehousing implementations have been built on traditional relational database platforms such as SQL Server, Oracle, or DB2. As data volumes grow and more data sources come online, organizations are hitting the limits of these platforms and responding by expanding their infrastructure. If your organization is like most facing this challenge, you may be experiencing the limitations of these platforms even after increased investment. The problem is that these platforms do not scale linearly with added hardware; bottlenecks inherent to the technology prevent it. What is required is a leap forward and a complete paradigm shift. This is where big data comes in. Big data architectures solve this problem with massive parallelism, enabling linear scalability as you grow your hardware platform. However, as you start investigating big data, you will find it prohibitive to move your existing analysts onto this new platform without significant retooling and re-staffing. Neither of these options may be available to you, and neither is necessary to achieve big data scale.
The ideal solution would allow you to take your existing analysts and shift them to a big data-style platform that supports SQL (or SQL-like functionality), minimizing the knowledge and process transfer required. In fact, such options have existed for some time. Netezza, for example, is a massively parallel, high-performance data warehousing appliance that has been around for over a decade, and Microsoft and Teradata have competing offerings. The downside to these solutions is the high cost of entry and the significant expertise required to configure and administer them.
Recently, Hadoop has inspired several new companies to offer alternatives to traditional data warehouse appliances that are accessible from a cost perspective and easy to launch either on-premises or in the cloud. These offerings are turning the analytics landscape on its head by delivering SQL capabilities at big data scale with a very low cost of entry. Some companies refer to this as “Big SQL”; in fact, IBM has a new product with that name.
At StatSlice, we have started proofs of concept (POCs) with several new tools offering Big SQL capabilities. Many of these tools are still in beta; even so, early results show dramatic improvements over traditional data warehousing platforms.
Over the coming months, we will be posting the results of these POCs and distilling them into analytics roadmap recommendations for our clients, so stay tuned.
Bernard Wehbe is responsible for operational strategy, account management, and the management of day-to-day company activities at StatSlice Systems. He has over twelve years of consulting experience focused exclusively on data warehousing and business intelligence. His experience includes data warehousing architecture, OLAP, data modeling, ETL, reporting and dashboarding, business analytics, team leadership, and project management. His industry expertise includes, but is not limited to, Consulting Services, Financial Services, Retail, Transportation, Manufacturing, Telecom, and Online Subscription. Bernard received a Bachelor of Science as well as a Master of Science in Industrial Engineering & Management from Oklahoma State University.
You can subscribe to our RSS feed.