Joseph (JJ) Jacks for COSS Community

OCS 2020 Breakout: Dr. Einat Orr

Einat Orr is the CEO and Co-founder of Treeverse, the company behind lakeFS, an open source project that empowers data lakes with ACID guarantees. She received her PhD in Mathematics from Tel Aviv University, in the field of optimization in graph theory. Einat previously led several engineering organizations, most recently as CTO at SimilarWeb.

Relevant Links
LinkedIn - Twitter

In this talk, we’ll try to understand why open core doesn’t fit all domains, and share why we chose an open core business model for lakeFS.

Presentation topic: Why open core doesn’t fit all domains - 0:00

About Dr. Einat Orr - 0:32

Presentation agenda - 1:07

The observation: Atomic versioned data lake on top of object storage - 1:40

A Modern Data Architecture - 2:35

The Open Core Domains - 4:52

At the time when SQL was enough: Postgres and MySQL, IBM Db2 and Oracle - 5:10

Data is getting Big: NoSQL DBs (Bigtable in 2005, the “first acceptable attempt of NoSQL”, and more examples that followed: Cassandra, Elasticsearch, Redis, Neo4j, MongoDB, Scylla). Basic expectation for tools to be open source. - 6:11

Data is getting Big: distribute! (Hadoop in 2006, and the Hadoop ecosystem that followed: Oozie, ORC, Avro, Hive, ZooKeeper, HBase, Mesos, Flume, Sqoop). Basic expectation for tools to be open source. - 7:52

The same trend happened with ingest tools. Basic expectation for tools to be open source. - 10:50

Trend continued into Orchestration, Observability, Metastores and Catalogs, Versioning, Mutable Formats - 11:23

Storage as a parable, starting with S3 original use cases - 13:04

S3’s current use case. As it becomes used for object storage, more object storage solutions join the open-source ecosystem (OpenIO, Ceph, MinIO). This shift shows that in the big data world, whenever a technology that used to be closed source joins the party, someone will come in and answer the basic expectation with an open-source solution. - 13:55

Analytics DBs. Part of the big data space, but most tools are closed source. Very few players in open source. - 15:06

Why would the business model be so different for analytical DBs? The customer. You sell analytical databases to the business people and teams who want the insights (e.g. management, strategy, finance, marketing ops, sales ops). Those teams don’t have developers, so buying software is more natural for them. Because the customer remains not-a-developer, the analytical DBs remain not-open-source. - 18:12

Share your questions and comments below!
