Tutorial: Creating a Multi-Cloud VPN with Terraform between AWS, GCP, and Azure
At Silectis, we deploy Magpie clusters across AWS, Google Cloud, and Azure. But because some of our internal infrastructure resides only on AWS, we need to establish private connections between these environments so that clusters on Google Cloud and Azure can access those private AWS resources. In this post, we’ll walk through the details of how we …
How to Ensure Security In Your Data Lake
THE IMPORTANCE OF SECURITY This post is the third in a series of posts about getting up and running with a Magpie Data Lake. In previous blog posts, we’ve discussed rapidly prototyping a data lake with Magpie, and automating loads into a data lake with Magpie. This post will address a third important piece of data lake infrastructure: security …
Data Integrity Issues Holding You Back?
In this blog post, we explore how a data analyst can know when to trust data and when to be skeptical. The data integrity features in a modern data engineering platform should help you sort this out quickly and move on to empowering decision-making across the company. DATA INTEGRITY: JUST THE FACTS People say you can trust data. “Let …
Tutorial: Using Magpie to Implement a Cloud Data Lake
PILOTING A DATA LAKE WITH MAGPIE More and more companies are turning to data lakes as a way to unify and get value out of their growing collections of data. However, it can be challenging to navigate the ever-changing technology landscape around these lakes, set one up, and quickly get value from it. At Silectis, we recommend that our …
Data Lake Architecture Guide: Choosing the Right Storage Tool
Overview: Build Your Data Architecture to Enable Use Cases One of the things that we often wrestle with in building out data lake architecture is how to best lay out the infrastructure to support different analytical use cases, and more specifically, what storage mechanism might yield the best performance. One of the virtues of data lakes is …
Magpie in Action: Job Management
A quick explainer on why we built job management directly into Magpie, our data engineering platform. Plus, a step-by-step tutorial on how to use it. SETTING THE STAGE When first setting up a data lake, it is common for organizations to start with a static export of data. This enables users to immediately take advantage of the advanced …