Sync with Confidence: Stay ahead of data surprises w/ warehouse-centric observability for rETL

Katy Yuan May 12, 2022

Katy is a Product Marketing Manager at Census who loves diving into startups, SaaS technology, and modern data platforms. When she's not working, you can find her playing pickleball or Ultimate Frisbee.

Data teams desperately want to support the requests of business teams. But just when data engineers settle into impactful, career-building project work, they’re often interrupted by a data quality bug or a broken pipeline out of nowhere…👇 (Hint: Make sure you have your sound on 🔉)

Luckily, Census helps you level up your visibility and bug-fighting skills. 🐛

Today, we’re announcing an entire set of ✨ warehouse-centric observability ✨ capabilities that ensure data quality for reverse ETL, so data teams can sync data into downstream business tools with confidence (and without all the bug smashing).

🥇 Census: The most observable reverse ETL platform

Census has always had the fastest and most reliable reverse ETL connectors. However, shipping all your data everywhere isn’t the goal – it’s shipping high-quality data everywhere your ops teams need it.

With the best observability, logging, and debugging capabilities centered around your warehouse, Census provides a single place for the data team to track where data flows to downstream tools and the health of those pipelines.

Now, you can Sync with Confidence when sending and updating large volumes of data in downstream tools (meaning you can move faster when delivering data to ops teams).

Why “warehouse-centric” matters for observability

We believe all data and business logic should be centralized in the warehouse, the central hub for your data team.

Unlike other reverse ETL platforms, Census’s advanced observability tools work directly on your warehouse. No wrestling with S3 buckets, external files, or additional tools to get visibility into your data pipelines.

Native integration with your warehouse is a big deal:

  • All your logs are in the warehouse so you can analyze and combine them with all your other data.
  • All operations are compatible with SQL. You can even incorporate sync logs into your Census SQL models.
  • Use the tools you already have: BI tools, SQL clients, and data testing frameworks like Great Expectations or dbt.

What’s included in our observability suite?

🗄️ Sync Logs

Sync Logs provide detailed logs of each data point you’ve synced so you can audit, troubleshoot, and create alerts using the most granular information. We store logs directly in your warehouse so you can analyze and combine them with all your other data. You can also query them with SQL models in Census!

Logging in the warehouse enables transparency for all data flows 💪

Easily see which records failed and why, as well as which synced successfully, so you can feel confident your data has synced correctly or quickly troubleshoot any issues that arise.
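As a rough illustration of what querying sync logs with SQL could look like, here is a minimal sketch that groups failed records by failure reason. An in-memory SQLite database stands in for the warehouse, and the `sync_log` table and its columns (`sync_id`, `record_id`, `status`, `status_message`) are hypothetical names chosen for the example, not Census's actual log schema; check the product docs for the real one.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse. The table and column
# names are hypothetical, not Census's actual sync log schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sync_log ("
    "sync_id INTEGER, record_id TEXT, status TEXT, status_message TEXT)"
)
conn.executemany(
    "INSERT INTO sync_log VALUES (?, ?, ?, ?)",
    [
        (1, "a1", "succeeded", None),
        (1, "a2", "rejected", "invalid email"),
        (1, "a3", "rejected", "invalid email"),
        (1, "a4", "rejected", "missing required field"),
    ],
)


def failure_reasons(conn, sync_id):
    """Return (reason, count) pairs for records that did not succeed,
    most common reason first."""
    rows = conn.execute(
        """
        SELECT status_message, COUNT(*) AS n
        FROM sync_log
        WHERE sync_id = ? AND status <> 'succeeded'
        GROUP BY status_message
        ORDER BY n DESC, status_message
        """,
        (sync_id,),
    )
    return rows.fetchall()


print(failure_reasons(conn, 1))
```

Because the logs are ordinary warehouse rows, the same kind of query could feed a BI dashboard or a data test rather than a script.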

Read the blog or product docs.

Sync Logs are compatible with SQL clients, BI tools, dbt, and more

🕵️ API Inspector

The API Inspector helps you play Data Detective with real-time transparency around API requests and responses. See the API calls Census makes to destinations, down to the individual row, so you can track and fix errors as they happen.

Read the blog or product docs.

Drill down into requests & responses in real time while your sync is running

🔮 Sync Dry Runs New!

Sync Dry Runs provide a detailed summary report of how your data downstream will change – before you make any changes. The report shows the expected sync time, source record errors, and destination changes (number of records created, updated, and deleted). A dry run serves as a sanity check so both data and ops teams are aware of the scope of their updates and can make final approvals.
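To make the idea of a dry run concrete, here is a minimal sketch of how created, updated, and deleted counts could be computed by comparing source and destination records without writing anything. The function name and the data shapes are assumptions for illustration, and it assumes a mirror-style sync in which records missing from the source are deleted; this post does not describe how Census computes its report internally.

```python
def dry_run_summary(source, destination):
    """Count the changes a sync would make, without applying any of them.

    Both arguments map a record ID to a dict of field values.
    Hypothetical helper for illustration only.
    """
    # In the source but not yet in the destination: would be created.
    created = [k for k in source if k not in destination]
    # In both places but with different field values: would be updated.
    updated = [
        k for k in source if k in destination and source[k] != destination[k]
    ]
    # In the destination but no longer in the source: would be deleted.
    deleted = [k for k in destination if k not in source]
    return {
        "created": len(created),
        "updated": len(updated),
        "deleted": len(deleted),
    }


source = {"1": {"plan": "pro"}, "2": {"plan": "free"}, "3": {"plan": "team"}}
destination = {"1": {"plan": "free"}, "2": {"plan": "free"}, "4": {"plan": "pro"}}

print(dry_run_summary(source, destination))
```

The point of the sketch is that a preview only reads from both sides, so it is safe to run and review before anyone approves the real sync.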

Sync Dry Runs are available today on Salesforce, our most popular destination, and will be available on other connectors very soon.

Read the product docs

Run a test sync without changing any records to preview how the destination will change

⚠️ Custom Alerting

With custom alerting, Census alerts you to invalid and rejected records and general errors so you can fix issues before they cause major breaks. Alerts aren’t limited to sync failures, either: you can configure them per individual sync and trigger them based on a percentage threshold of rejected records. When something goes wrong, alerts reach you quickly by email, Slack, or other tools.
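The percentage-threshold idea can be sketched in a few lines. The function below is a hypothetical illustration, not Census's implementation: the name, the parameters, and the choice to fire at or above the threshold are all assumptions.

```python
def should_alert(total_records, rejected_records, threshold_pct):
    """Return True when the share of rejected records reaches the threshold.

    Hypothetical helper: firing at or above the threshold is an
    assumption made for this example.
    """
    if total_records == 0:
        # Nothing was synced, so there is no rejection rate to alert on.
        return False
    rejected_pct = 100 * rejected_records / total_records
    return rejected_pct >= threshold_pct


# A sync of 1,000 records with a 5% threshold:
print(should_alert(1000, 30, 5))  # 3% rejected
print(should_alert(1000, 80, 5))  # 8% rejected
```

Setting the threshold per sync lets a noisy, low-stakes sync tolerate a few rejected rows while a critical one alerts on almost any failure.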

Read the product docs

Configure thresholds for alerts, per individual sync, to avoid alert fatigue

⛔️ Invalid / Rejected Records

In the Sync History tab of any sync, you can click into a specific failed run to see a sample of invalid or rejected records (up to 100), and the reasons why they failed. Census automatically generates this diagnostic log in the app UI for all failed syncs.

Read the product docs

Quickly see in the product UI any records that were skipped or rejected

Live Session: See observability in action

Register to see a product demo of a live debugging workflow and why Census is the #1 most observable reverse ETL platform:

Observability Product Showcase: Live Debugging for Reverse ETL

🗓️ Thursday, May 26th, 11 AM PT / 2 PM ET

Sign up to save your spot at the webinar!

Availability

Every plan, including our free tier, has access to alerting and invalid/rejected records. Sync Dry Runs and the API Inspector are enabled for any paid Census plan. Sync Logs are only available on Business plans and above.

👉 Get a demo of Census or try for free today!

What’s next?

We’re dedicated to helping data folks stand out 🦩 and drive more impact in the business, instead of spending their days overwhelmed by manual data tasks.

In the near future, we plan to expand our existing observability suite and build even more capabilities to improve visibility and data quality – so even more teams can sync with confidence. Our roadmap includes integration with dedicated observability platforms, field-level (not just record-level) details for Sync Dry Runs, and more collaboration tools for ops users (like Salesforce admins) to get involved with upstream data.

Ultimately, we’ve aligned our mission around enabling both data and ops teams to achieve Operational Analytics, which means action, automation, and trust with data for everyone, regardless of technical skill level.

🥳 If you’d like to help us build this vision, we’re hiring!
