Reposted from: https://medium.com/castled/fastest-reverse-etl-platform-census-vs-hightouch-vs-castled-3d2975dd4e55
Fastest Reverse ETL Platform: Census vs Hightouch vs Castled
It has not even been a year since the term Reverse ETL was coined. Since then, hundreds of modern data-driven organisations have completed their “data integration loop” by syncing their valuable customer and product insights from the cloud data warehouse to their business tools.
While the data community has largely accepted Reverse ETL as the missing piece in the modern data stack, there has been some debate of late about the need for speed in a Reverse ETL solution. Census started the discussion when they published a blog post a month ago claiming to be the fastest Reverse ETL solution (44x faster than all the competitors). They even encouraged other solutions to publish their own performance benchmarks using the Census dataset.
While we do have a different take on the need for speed in a data integration solution, we thought it would be interesting to take up this challenge.
Benchmark Specifications and Results
The BigQuery table used in the benchmark was created by importing the CSV file shared by Census. Castled took 1 minute 17 seconds to sync 2.2 million records from BigQuery to the Mixpanel Events API.
Of this, about 40 seconds went to running the query and exporting the results to a GCS bucket, and around 35 seconds went to our data sync framework pushing the data to the Mixpanel Events API.
How do we compare with Hightouch and Census
For our comparison, we used the numbers Census published for the same benchmark. Since Hightouch has not published official numbers for it, we ran the same benchmark ourselves on Hightouch’s cloud platform.
As per our benchmarks, Castled is roughly 6.4 times as fast as Census and 13.2 times as fast as Hightouch (642% and 1324% of their respective throughputs). Census clocked 8 minutes 15 seconds to sync the same dataset, while Hightouch took 17 minutes. Castled synced the data at a throughput of 29351 records/second, while Census and Hightouch handled 4565 records/second and 2215 records/second respectively.
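The comparison above is simple arithmetic on the reported durations and throughputs. The following minimal Python sketch uses only the figures quoted in this post (the helper names are illustrative and not part of Castled). It recomputes the speed ratios and backs out the record count that each throughput figure implies:

```python
# Durations reported in the post, converted to seconds.
DURATIONS_S = {
    "Castled": 1 * 60 + 17,   # 1 min 17 s
    "Census": 8 * 60 + 15,    # 8 min 15 s
    "Hightouch": 17 * 60,     # 17 min
}

# Throughputs reported in the post, in records/second.
THROUGHPUT_RPS = {"Castled": 29351, "Census": 4565, "Hightouch": 2215}


def speed_ratio(baseline_s: float, candidate_s: float) -> float:
    """How many times as fast the candidate is on the same dataset."""
    return baseline_s / candidate_s


def implied_records(throughput_rps: float, duration_s: float) -> float:
    """Record count implied by a throughput sustained for a duration."""
    return throughput_rps * duration_s


for name in ("Census", "Hightouch"):
    ratio = speed_ratio(DURATIONS_S[name], DURATIONS_S["Castled"])
    print(f"Castled vs {name}: {ratio:.2f}x as fast")

for name, rps in THROUGHPUT_RPS.items():
    millions = implied_records(rps, DURATIONS_S[name]) / 1e6
    print(f"{name}: ~{millions:.2f}M records implied")
```

The durations give ratios of about 6.4 and 13.2, and all three throughput figures imply a dataset of roughly 2.26 million records, consistent with the "2.2 million" quoted earlier.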
Ramping up the load
While we were at it, we ramped up the load further to check how Castled scales to larger data sets. We benchmarked Castled with up to 100 million records. Castled took ~27 minutes to sync 100 million records to Mixpanel, at a rate of 63613 records/second.
Syncing 100 million records
We also observed that our throughput increases with the number of records synced. That is largely because the average time taken to query and export a record drops considerably as the size of the data set increases.
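One way to picture this effect is a toy model in which each sync pays a roughly fixed query-and-export cost and then pushes records at a steady rate. This is only a sketch under stated assumptions: the 40-second overhead is borrowed from the smaller run above and assumed to stay constant, and the steady-state rate is a hypothetical value, not a measured Castled figure.

```python
def effective_throughput(records: int,
                         fixed_overhead_s: float,
                         steady_rate_rps: float) -> float:
    """End-to-end records/second when a fixed overhead precedes a steady sync."""
    total_s = fixed_overhead_s + records / steady_rate_rps
    return records / total_s


# Assumption: query + export overhead stays near the 40 s seen in the small run.
FIXED_OVERHEAD_S = 40.0
# Assumption: hypothetical steady-state sync rate, chosen only for illustration.
STEADY_RATE_RPS = 65_000.0

for n in (1_000_000, 2_200_000, 10_000_000, 100_000_000):
    rps = effective_throughput(n, FIXED_OVERHEAD_S, STEADY_RATE_RPS)
    print(f"{n:>11,} records -> {rps:,.0f} records/second")
```

In this model the fixed cost is spread over more records as the data set grows, so the end-to-end throughput climbs with size and levels off near the steady-state rate.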
So is Castled the fastest Reverse ETL solution?
Going by these benchmarks, we are currently much faster than Census and Hightouch. But claiming to be the fastest Reverse ETL solution at this stage, when Reverse ETL is just getting started, seems a bit premature in our opinion.
While we know that we have built a highly scalable platform at Castled, we like to believe performance is just a number. Our engineering team comprises experienced engineers who have scaled systems from zero to near infinity. So we know that other solutions can potentially scale up their platforms in a short period with significant investment in engineering.
Do not just take our word for it
You do not have to take our word for the numbers we have published. Castled is open source. You should be able to spin it up on your desktop or deploy it on-premises in under a minute and try this out yourself.
This is our GitHub repo: https://github.com/castledio/castled. If you find it useful, do not hesitate to show your support by starring our repo.