Seagate connects Hadoop and Lustre in an open sourcery ceremony

Streamlined workflows and easier processing for Hadoop-using apps

Seagate has written a Hadoop connector for Lustre, meaning Hadoop-using systems can now fetch data from a Lustre parallel file system array, as part of a small contribution by the US data storage company to an open source world.

The Hadoop on Lustre Connector (HoLC) means that data stored on a Lustre system no longer has to be copied to an HDFS store before Hadoop-using applications can process it.
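To make the difference concrete, here is a minimal Python sketch of the two workflows. It is not HoLC code and uses no Hadoop or Lustre API: temporary directories stand in for the shared Lustre store and the separate HDFS store, a word count stands in for the Hadoop job, and every function name is hypothetical.

```python
import shutil
import tempfile
from collections import Counter
from pathlib import Path


def count_words(data_dir: Path) -> Counter:
    """Stand-in for a Hadoop job: count words in every .txt file in data_dir."""
    counts = Counter()
    for path in sorted(data_dir.glob("*.txt")):
        counts.update(path.read_text().split())
    return counts


def staged_workflow(shared_store: Path, job_store: Path):
    """Copy every file into a separate job store first, then process the copy."""
    bytes_copied = 0
    for path in shared_store.glob("*.txt"):
        shutil.copy(path, job_store / path.name)
        bytes_copied += path.stat().st_size
    return count_words(job_store), bytes_copied


def direct_workflow(shared_store: Path):
    """Process the data where it already lives; nothing is copied."""
    return count_words(shared_store), 0


def demo():
    # Temporary directories are assumptions standing in for the two stores.
    with tempfile.TemporaryDirectory() as shared, tempfile.TemporaryDirectory() as job:
        shared_store, job_store = Path(shared), Path(job)
        (shared_store / "a.txt").write_text("lustre hadoop lustre")
        (shared_store / "b.txt").write_text("hadoop hive pig")
        staged_counts, staged_bytes = staged_workflow(shared_store, job_store)
        direct_counts, direct_bytes = direct_workflow(shared_store)
    return staged_counts, staged_bytes, direct_counts, direct_bytes
```

Both paths produce the same counts; the staged one simply pays for a full copy of the data set before any processing starts, which is the step the connector is meant to remove.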

Hadoop tools such as Mahout, Hive and Pig can use a Lustre filesystem.

Seagate is releasing patch source code for Hadoop that enables diskless Hadoop clusters to access data on a Lustre HPC-style data store. Overall, Seagate claims HoLC can streamline Hadoop workflows.

Xyratex, now owned by Seagate, bought some Lustre IP, along with the Lustre website, logo and trademark, from Oracle in February 2013.

It is now transferring the related assets to OpenSFS (Open Scalable File Systems) and EOFS (European Open Filesystem SCE), arguing these two are trusted stewards of the Lustre community.

Seagate is far from walking away, though: it contributes to OpenSFS at the highest ‘Promoter’ level and still sits on its board.

Another example, it says, of its open source credentials was making its Ethernet Drive (Kinetic) interface specification and T-Card developer adapter available to the Open Compute Project in January this year. ®

Biting the hand that feeds IT © 1998–2021