This article is more than 1 year old

Like an everflowing stream: New tech promises remote S3 nearline disk performance

Cool, but streaming doesn't mean screaming

Analysis You can't store files in Amazon's public cloud, access them on-premises, and expect local disk access performance.

You can store them in a sync-and-share facility like Box or Dropbox, but then they have to be downloaded completely. That's not so good for large files, large data sets and production environments.

You could also use a cloud storage gateway, like Nasuni or Panzura, which works fine but adds complexity and may not scale.

Startup LucidLink claims it uses local metadata caching, parallel TCP/IP streaming, pre-fetching and caching to make public cloud-stored files usable for on-premises primary data storage.

It was founded by two ex-DataCore people, CEO Peter Thompson and CTO George Dochev. That background is relevant because DataCore uses parallelised IO in its record-breaking SPC-1 v1 benchmark results.

Dochev was DataCore's director of software engineering until June 2015. LucidLink was founded in January 2016, took in $1.6m in seed funding in December that year and has just had a second seed round, $5.5m, this year.

What they appear to have come up with is a faster way of streaming files from remote object stores.

Let's start by having files stored as objects in S3 buckets, Amazon being their first supported cloud.

The things that get in the way of being able to use NFS, CIFS or SMB to stream data from them for on-premises use are time and latency. TCP/IP, for example, is a chatty protocol, with many metadata message sequences as well as data transfer sequences.

Specialist suppliers, such as Bridgeworks in the UK, speed things up by parallelising TCP/IP streams and so cut the transfer time. That's part of what LucidLink's technology does.
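To make the parallel-stream idea concrete, here is a minimal Python sketch. It is not LucidLink's or Bridgeworks' code: the "remote object" is an in-memory byte string, the 20ms round trip is an assumed figure, and `read_range`, `split_ranges`, `fetch_serial` and `fetch_parallel` are hypothetical names invented for illustration. It only shows why issuing several ranged reads at once hides per-request latency.

```python
from __future__ import annotations

import time
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a remote object; a real client would issue ranged reads
# against an object store instead of slicing a local byte string.
REMOTE_OBJECT = bytes(range(256)) * 64  # 16 KiB of sample data
ROUND_TRIP_SECONDS = 0.02  # assumed per-request latency, purely illustrative


def read_range(start: int, end: int) -> bytes:
    """Simulate one ranged read that pays a fixed round-trip delay."""
    time.sleep(ROUND_TRIP_SECONDS)
    return REMOTE_OBJECT[start:end]


def split_ranges(size: int, chunk: int) -> list[tuple[int, int]]:
    """Cut [0, size) into consecutive half-open byte ranges."""
    return [(off, min(off + chunk, size)) for off in range(0, size, chunk)]


def fetch_serial(size: int, chunk: int) -> bytes:
    """Fetch every range one after another, paying each round trip in turn."""
    return b"".join(read_range(s, e) for s, e in split_ranges(size, chunk))


def fetch_parallel(size: int, chunk: int, streams: int) -> bytes:
    """Fetch ranges over several concurrent streams and reassemble in order."""
    ranges = split_ranges(size, chunk)
    with ThreadPoolExecutor(max_workers=streams) as pool:
        # map() yields results in submission order, so the parts line up.
        parts = pool.map(lambda r: read_range(*r), ranges)
        return b"".join(parts)


if __name__ == "__main__":
    size = len(REMOTE_OBJECT)
    for name, fetch in (("serial", lambda: fetch_serial(size, 2048)),
                        ("parallel", lambda: fetch_parallel(size, 2048, 8))):
        began = time.perf_counter()
        data = fetch()
        print(name, len(data), "bytes in", round(time.perf_counter() - began, 3), "s")
```

With eight chunks, the serial path pays the simulated round trip eight times, while the parallel path pays it roughly once. Real links are also bounded by bandwidth, which this toy ignores.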

An architecture diagram shows a LucidLink store in Amazon S3 and a LucidLink app (or agent) in the customer's server. This stores synced metadata from the Amazon-resident LucidLink store and presents the LucidLink files as part of the local server's OS file system and folders/mount points.

[Figure: LucidLink architecture diagram]

If a user's application needs a file then it is streamed from the AWS S3 store on-demand to the local server and is available for use as soon as the initial set of bytes has been received. Parallel TCP/IP streaming is used, metadata chatter is reduced, and pre-fetching and caching speed things up as well.
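The pre-fetching and caching part can be sketched in a few lines of Python. LucidLink has not published its algorithm, so treat this as one common approach, assumed for illustration: a least-recently-used block cache that, on a miss, also pulls in the next few blocks. The class `ReadAheadCache`, its parameters and the `fetch_block` callback are all hypothetical names.

```python
from collections import OrderedDict
from typing import Callable


class ReadAheadCache:
    """LRU block cache that also loads the next few blocks on a miss."""

    def __init__(self, fetch_block: Callable[[int], bytes],
                 block_count: int, capacity: int, read_ahead: int):
        self.fetch_block = fetch_block  # stand-in for a remote ranged read
        self.block_count = block_count  # number of blocks in the file
        self.capacity = capacity        # maximum blocks held locally
        self.read_ahead = read_ahead    # extra blocks to pull in on a miss
        self.blocks = OrderedDict()     # block index -> bytes, oldest first
        self.remote_fetches = 0

    def _load(self, index: int) -> None:
        """Fetch one block from the remote side unless it is already cached."""
        if index in self.blocks:
            return
        self.blocks[index] = self.fetch_block(index)
        self.remote_fetches += 1
        while len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used

    def read(self, index: int) -> bytes:
        """Return a block, serving from cache when possible."""
        if index in self.blocks:
            self.blocks.move_to_end(index)  # mark as recently used
            return self.blocks[index]
        self._load(index)
        data = self.blocks[index]
        # Speculatively load the following blocks, stopping at end of file.
        last = min(index + self.read_ahead, self.block_count - 1)
        for nxt in range(index + 1, last + 1):
            self._load(nxt)
        return data


if __name__ == "__main__":
    cache = ReadAheadCache(lambda i: bytes([i]) * 4,
                           block_count=10, capacity=4, read_ahead=2)
    for i in (0, 1, 2, 3):
        cache.read(i)
    print("remote fetches:", cache.remote_fetches)
```

A sequential reader mostly hits blocks that were fetched ahead of time, which is the effect the demo's second, cached boot relies on. A production version would fetch the read-ahead blocks in the background rather than inline.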

A LucidLink demo shows a server booting a Hyper-V VM from an Amazon S3 object store 140km away.

It takes about a minute for the VM's login screen to appear. Once the VM is in the local cache then a subsequent boot takes about six seconds.

What we have here is a means of using cheap cloud object storage and accessing files there as if they were stored on a local disk, though a fairly slow one. We might suggest it can turn an S3 archive into a nearline file store.

The streaming tech is bi-directional. So LucidLink's agent could operate in a data-producing edge device which streams data up to the cloud, e.g. video surveillance data. It has a customer doing this in production using the AWS Government cloud.

LucidLink provides its product tech as a subscription-based, pay-as-you-go service. It provides some cost comparisons to justify the worth of its technology:

  • AWS Elastic Block Store (EBS) – $1,230 per TB per year
  • AWS Elastic File System (EFS) – $3,680 per TB per year
  • LucidLink + AWS S3 – $895 per TB per year
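For a sense of scale, the short Python snippet below works out the percentage gap implied by those list figures. The numbers are LucidLink's own as quoted above. The dictionary and the `saving_vs` helper are illustrative names, and the calculation ignores anything the three prices may not include, such as egress or request charges.

```python
# Annual cost per TB in US dollars, as quoted by LucidLink.
COST_PER_TB_YEAR = {
    "AWS EBS": 1230,
    "AWS EFS": 3680,
    "LucidLink + AWS S3": 895,
}


def saving_vs(option: str, baseline: str) -> float:
    """Percentage by which `option` undercuts `baseline`."""
    base = COST_PER_TB_YEAR[baseline]
    return 100 * (base - COST_PER_TB_YEAR[option]) / base


for baseline in ("AWS EBS", "AWS EFS"):
    pct = saving_vs("LucidLink + AWS S3", baseline)
    print(f"{baseline}: {pct:.1f}% cheaper per TB per year")
# AWS EBS: 27.2% cheaper per TB per year
# AWS EFS: 75.7% cheaper per TB per year
```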

As LucidLink uses S3, in principle any S3-compatible object store could be used for its repository. That means Azure, with support coming, and GCP, which will be supported after Azure. Other potential targets are Backblaze B2, Cloudian, Scality and SwiftStack.

Its roadmap includes live data replication and migration between regions within a cloud and between clouds, snapshotting, audit capabilities, third-party software integrations and mobile support, in that order.

In theory, we think LucidLink could use its streaming file transfer technology to send data to/from public cloud file stores as well as object stores, if that became an economic thing to do. That way the currently necessary object-to-file translation process could be junked.

tl;dr

LucidLink is a software-based streaming and distributed cloud file storage access technology, using a public cloud S3 repository that provides nearline disk access speed to data in the cloud. ®
