Netezza surprises with technical capabilities

FPGAs, zonemaps, and a strong roadmap


Comment I have recently returned from Netezza's second annual conference. This was well attended, with nearly all of the company's customers (around 75) being represented, as well as a significant number of both prospects and partners.

It was very (to use a technical term) buzzy and there was a degree of enthusiasm that I have rarely encountered. However, what was most interesting for me was the number of things I had not previously appreciated about Netezza's technical capabilities. And, of course, its roadmap for the future (though I can't say too much about that).

To begin with, there is the question of indexes. Data warehouse appliances in general, and Netezza in particular, tend to be typecast by detractors as only being good for large table scans, because they do not support indexes and therefore cannot run complex joins.

However, in the case of Netezza, at any rate, this is misleading. This is because it uses what might be described as an anti-index, which is called a zonemap. A zonemap allows you to load, say, sales by time, and then the zonemap breaks the relevant data down into blocks, storing the details of the first and last record in each block (thus there is a much lower overhead compared to an index).

What this means is that when you run a query you only read the blocks that contain the data you are interested in, ignoring all the other blocks. This ability to limit the data you read means that joins are much more effective than would otherwise be the case. In its roadmap, Netezza described future approaches that will further reduce the amount of data you need to read.
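The block-skipping idea can be illustrated with a minimal sketch. It is not Netezza's implementation: the `Zone` class, the `build_zonemap` and `scan` functions, the block size, and the (day, amount) row layout are all hypothetical, and the sketch assumes the data is loaded in time order so that the first and last record of each block are also its lowest and highest keys.

```python
from dataclasses import dataclass
from typing import List, Tuple

Row = Tuple[int, float]  # (day number, sale amount): a hypothetical schema


@dataclass
class Zone:
    lo: int          # lowest key stored in the block
    hi: int          # highest key stored in the block
    rows: List[Row]  # the block's data (on disk in a real system)


def build_zonemap(rows: List[Row], block_size: int) -> List[Zone]:
    """Split rows into fixed-size blocks and record each block's key range."""
    zones = []
    for start in range(0, len(rows), block_size):
        block = rows[start:start + block_size]
        keys = [day for day, _ in block]
        zones.append(Zone(min(keys), max(keys), block))
    return zones


def scan(zones: List[Zone], day_from: int, day_to: int):
    """Return the matching rows and the number of blocks actually read."""
    matches, blocks_read = [], 0
    for zone in zones:
        # The block's range cannot overlap the query range, so skip it unread.
        if zone.hi < day_from or zone.lo > day_to:
            continue
        blocks_read += 1
        matches.extend(r for r in zone.rows if day_from <= r[0] <= day_to)
    return matches, blocks_read


# 100 days of sales loaded in time order, ten rows per block.
sales = [(day, 10.0 * day) for day in range(1, 101)]
zones = build_zonemap(sales, block_size=10)
rows, blocks_read = scan(zones, day_from=42, day_to=47)
print(f"read {blocks_read} of {len(zones)} blocks, found {len(rows)} rows")
```

Only two values are kept per block, which is why the overhead is so much lower than that of a conventional index; the trade-off is that pruning only works well when the data is reasonably ordered on the column being queried.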

Another interesting thing to come out of the conference was that a number of Netezza customers have stopped using aggregates as a result of implementing Netezza. For example, Carphone Warehouse told me that it was both faster and more accurate to calculate directly from the raw data.

As aggregates are a major issue for database administrators, being able to get rid of them (or, at least, minimise their use) is a significant benefit. Not that Netezza eschews aggregates altogether. More than one user employs a data warehouse appliance (not only from Netezza) as an aggregating engine as a front-end to a third party enterprise data warehouse. I will discuss this further in a subsequent article.

And while talking about enterprise data warehouses (EDW), there are several arguments put against using a data warehouse appliance as an EDW. The first is that you can't use an appliance for complex joins but, as discussed above, this is less and less true, at least as far as Netezza is concerned.

Secondly, there is the issue that the large EDW vendors provide pre-built data models - well, one of the things that Netezza has not made much of is the fact that it has partners that provide exactly this sort of capability (typically built on either a star or snowflake schema).

And, thirdly, there is the question of managing mixed workloads. In this last case, Netezza offers guaranteed resource allocation (floors but not ceilings yet), short query bias, materialised views, and prioritisation.

Another area in which Netezza has been hiding its light under a bushel is in the matter of FPGAs (field programmable gate arrays). FPGAs are used to process data as it is streamed off disk. Note that this is important to understand. Most data warehouse appliances (and, indeed, conventional products) use a caching architecture whereby data is read from disk and then held in cache for processing. Netezza, on the other hand, uses an approach that queries the data as it comes off disk before passing the results on to memory. In other words it uses a streaming architecture in which the data is streamed through the queries (whose programs have been loaded into the FPGA) rather than being stored (even if in memory) and then queried.
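The difference between the two architectures can be sketched in software, with the caveat that this is an analogy only: Netezza does the filtering in FPGA hardware, whereas the sketch uses Python generators. The names `read_from_disk`, `cached_query`, and `streamed_query`, and the row layout, are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Row = Tuple[int, str, float]  # (id, region, amount): a hypothetical schema


def read_from_disk(n: int) -> Iterator[Row]:
    """Stand-in for a disk scan that yields one row at a time."""
    for i in range(n):
        yield (i, "north" if i % 2 == 0 else "south", float(i))


def cached_query(rows: Iterable[Row], predicate: Callable[[Row], bool]) -> List[Row]:
    """Caching style: hold every row in memory first, then query the cache."""
    cache = list(rows)
    return [r for r in cache if predicate(r)]


def streamed_query(rows: Iterable[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Streaming style: test each row as it arrives and pass on only the matches."""
    for r in rows:
        if predicate(r):
            yield r


def wanted(row: Row) -> bool:
    return row[1] == "north" and row[2] >= 6


cached = cached_query(read_from_disk(10), wanted)
streamed = list(streamed_query(read_from_disk(10), wanted))
print(cached == streamed, len(streamed))
```

Both functions return the same answer; the difference is that the streamed version never holds more than one unfiltered row at a time, so only the results reach memory.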

There are several points to make about this. The first is that you can get much better performance when using this sort of approach than when using a conventional one. For example, it is stream-based processing that is used for algorithmic trading, where processing requirements are of the order of 150,000 transactions per second.

The second is that FPGAs are the natural way of handling streaming environments. For example, they are widely used for voice and video streaming. They are not yet used for event stream processing, but we know of one vendor that plans to do exactly that.

In turn, what this means is that FPGAs are very much a commodity item. Those of us working in more conventional environments may not think of FPGAs like that, but they are as much of a commodity as, say, an Intel processor.

And talking about processors, the other thing that Netezza uses that may seem odd to some people is that it employs a PowerPC chip rather than using said Intel (or AMD). Again, this is similarly a commodity device that is widely used in small footprint devices, primarily because of its low power consumption.

To be specific, a Netezza Snippet Processing Unit (where a snippet is the compiled SQL query that data is streamed through) requires just 30 watts. A complete Netezza rack with 112 of these and 16.5TB of disks (with 5.5TB of user data) requires little more than 4kW and produces 12,000 BTU heat output. Given the power and cooling issues afflicting most data centres today, this is a substantial advantage, as are the reduced floor space requirements.
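As a rough sanity check on these figures, the arithmetic can be worked through as follows. The inputs are the numbers quoted above; the conversion factor of about 3.412 BTU per hour per watt is the standard one, and reading the heat output as a per-hour figure is an assumption, as the quoted figure does not state a time unit.

```python
SPU_WATTS = 30                 # per Snippet Processing Unit, as quoted
SPUS_PER_RACK = 112            # as quoted
DISK_TB = 16.5                 # raw disk per rack, as quoted
USER_DATA_TB = 5.5             # user data per rack, as quoted
BTU_PER_HOUR_PER_WATT = 3.412  # standard conversion (1 W is about 3.412 BTU/h)

# Power drawn by the SPUs alone, before anything else in the rack.
spu_total_watts = SPU_WATTS * SPUS_PER_RACK

# Heat the SPUs alone would give off, assuming the quoted figure is per hour.
spu_btu_per_hour = spu_total_watts * BTU_PER_HOUR_PER_WATT

# Fraction of the raw disk capacity that holds user data.
user_data_fraction = USER_DATA_TB / DISK_TB

print(f"SPUs: {spu_total_watts} W, about {spu_btu_per_hour:.0f} BTU/h")
print(f"user data: {user_data_fraction:.0%} of raw disk")
```

On these assumptions the SPUs account for 3,360 watts, which is in the same range as the quoted rack power and heat output, and user data takes up one third of the raw disk capacity.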

Returning to FPGAs for a moment, these are following a similar price/performance curve to that of processors. It is expected that performance and price will both improve by five times by 2010, as will the amount of logic that you can put on an FPGA. This last is particularly important because it will enable Netezza to introduce even more functionality into the FPGA in the future.

Even with the current FPGAs, Netezza plans to introduce features that will increase raw scan-rate performance, tactical query performance, and advanced analytic performance. The advanced analytic capabilities will be made available to partners rather than end users and will allow predictive analytics vendors (like SPSS or SAS) to embed scoring capabilities (say) directly into the FPGA, which should provide significant performance advantages.

Another potential use of the functionality embedded in the FPGA would be to implement column-level encryption, which would be useful for companies in the data aggregation and resale market, for example, because you could use different encryption techniques for each customer's data.

Encryption generally is not available and is not currently on the roadmap. While I would like to see this, it is arguably unnecessary: given the structure of a Netezza appliance you would need some seriously good hacking skills to read a Netezza disk, even if you could get at one. Column-level encryption on its own may therefore be good enough.

To conclude, I was surprised by this conference, not just by the enthusiasm of the attendees but also by some of the functionality that Netezza can offer, which I don't think it has done a good job of explaining to the market. It has, for obvious reasons, concentrated on performance, price and reduced cost of ownership but, to take TCO, it has tended to focus on the removal of indexes and tuning but hasn't discussed its advantages when it comes to aggregates.

Similarly, it hasn't really explained why using FPGAs is a good idea, it hasn't made it clear that zonemaps are a form of anti-index, and it hasn't talked much about its advantages in the data centre.

Given all of this, and adding in the rich set of new features in the company's roadmap (a number of which I have not mentioned), there is no reason to expect Netezza to do anything but go from strength to strength.

Copyright © 2006, IT-Analysis.com

