The Secret to Becoming a Great Data Engineer with Zach Wilson (DataExpert.io, Facebook, Netflix)
Summary
TLDR: Zach Wilson, the creator of DataEngineer.io, shares his perspective on the challenges and rewards of data engineering and the importance of community in learning. He discusses his experiences with SQL, the limitations of using AI for generating complex queries, and the role of crowdsourcing knowledge. Emphasizing practical, experience-based teaching, Zach highlights his approach of bringing in experts to cover areas outside his expertise. He also introduces TechCreator, a platform designed to empower other creators to build and launch cohort-based courses, offering new opportunities for educators in the data community.
Takeaways
- Zach Wilson is skeptical about using ChatGPT to generate SQL queries, because complex queries can contain subtle errors that are hard to debug.
- He points out that ChatGPT can handle simple queries but struggles with more intricate logic, often producing mistakes that are hard for less experienced users to detect.
- Community is a crucial part of data engineering for Zach: he enjoys engaging with others and values crowdsourced knowledge as a way to get different perspectives on complex topics.
- Zach acknowledges his own blind spots, such as never having used popular tools like dbt, and is willing to learn from others to fill those gaps.
- Running his bootcamp has taught Zach the importance of deep, experience-based teaching, and he now plans to bring in guest lecturers with specific expertise to cover areas outside his knowledge.
- After the taxing experience of filming 65 hours of content in six weeks, Zach plans to reduce his personal workload in the next bootcamp by delegating some lessons to external experts.
- Zach's entrepreneurial projects include DataEngineer.io, which focuses on bootcamps, and TechCreator, a platform designed to help content creators launch and manage cohort-based courses independently of third-party platforms.
- The TechCreator platform will let creators index and organize their social media content, making it easier to manage and present their knowledge to their audience.
- Zach avoids revenue-sharing platforms like Maven, choosing instead to build and host his own content platform, where he retains full control over his earnings.
- Zach plans to keep growing TechCreator, helping other thought leaders and educators build and scale their own educational businesses while offering more flexible course delivery options.
Q & A
Why does Zach Wilson not recommend using ChatGPT for SQL generation in data engineering?
-Zach Wilson explains that while ChatGPT can generate SQL queries, it often introduces subtle errors that are difficult to catch, especially in larger or complex queries. These errors can lead to incorrect data being returned, even if the query itself runs without syntax errors. He also notes that, as an experienced SQL practitioner, he can write small queries faster than he could formulate a prompt for ChatGPT.
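To make the point about "runs without syntax errors but returns incorrect data" concrete, here is a minimal sketch of one common kind of subtle mistake: a join that silently duplicates rows before an aggregate. It is not taken from the interview. The `orders` and `shipments` tables, their columns, and the values are all hypothetical, and the example uses Python's built-in `sqlite3` module only so that it is self-contained.

```python
import sqlite3

# Hypothetical schema and data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
CREATE TABLE shipments (shipment_id INTEGER PRIMARY KEY, order_id INTEGER);
INSERT INTO orders VALUES (1, 100, 50.0), (2, 100, 30.0);
-- Order 1 was split into two shipments.
INSERT INTO shipments VALUES (10, 1), (11, 1), (12, 2);
""")

# Syntactically valid, but the join repeats order 1 once per shipment,
# so its amount is counted twice in the SUM.
buggy_total = conn.execute("""
SELECT o.customer_id, SUM(o.amount)
FROM orders AS o
JOIN shipments AS s ON s.order_id = o.order_id
GROUP BY o.customer_id
""").fetchall()

# Aggregating the orders table directly avoids the row duplication.
correct_total = conn.execute("""
SELECT customer_id, SUM(amount)
FROM orders
GROUP BY customer_id
""").fetchall()

print("buggy:  ", buggy_total)
print("correct:", correct_total)
```

Both queries execute cleanly, and nothing in the first result signals a problem. Only someone who knows that an order can have several shipments would spot the inflated total, which is the kind of error that is hard for a less experienced user to catch.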
How does Zach Wilson view the role of community in data engineering?
-Zach values community highly, especially after moving away from big tech. He believes in the wisdom of crowds, where multiple perspectives and shared knowledge help identify solutions and fill knowledge gaps. He often posts hot takes on LinkedIn to encourage debate and learn from others, seeing the community as an essential resource for growth and collaboration.
What is Zach's perspective on the usefulness of crowd-sourced knowledge?
-Zach finds crowd-sourced knowledge invaluable. He references a story his father told him about estimating the length of a line in a class, where the collective estimates were remarkably accurate. This example highlights the power of crowdsourcing and how collective input can provide valuable insights, even from people who have no specialized tools or measurements.
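The line-estimation story can be sketched numerically. The example below is only an illustration: the true length and every guess are made-up numbers, not data from the interview. It compares the error of the averaged guess with the average error of the individual guesses.

```python
from statistics import mean

# All values are hypothetical and chosen only to illustrate the idea.
true_length_cm = 100.0
guesses_cm = [80.0, 95.0, 112.0, 120.0, 90.0, 106.0]

# The crowd estimate is the plain average of all guesses.
crowd_estimate = mean(guesses_cm)
crowd_error = abs(crowd_estimate - true_length_cm)

# How far off a typical individual is, on average.
individual_errors = [abs(g - true_length_cm) for g in guesses_cm]
typical_individual_error = mean(individual_errors)

print(f"crowd estimate: {crowd_estimate:.1f} cm (error {crowd_error:.1f} cm)")
print(f"average individual error: {typical_individual_error:.1f} cm")
```

Overestimates and underestimates partly cancel when averaged, so the crowd estimate lands closer to the truth than the typical individual. The cancellation depends on the guesses not all being biased in the same direction; if everyone overestimates, the average will overestimate too.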
Why does Zach not teach the dbt class in his bootcamp?
-Zach acknowledges that although dbt is a popular tool in data engineering, he has never used it personally. Because he lacks deep experience with dbt, he feels the class should be taught by someone who has used it in real work. He emphasizes that his bootcamp aims for deep, experience-based teaching, and that others should contribute where they have greater expertise.
What changes is Zach making to his bootcamp in its second iteration?
-In the second iteration of his bootcamp, Zach plans to reduce the number of lectures he personally teaches, from 22 in the first iteration to around 16. The remaining lectures will be taught by guest instructors or experts in the field, which will allow for a broader range of specialized topics and insights, ensuring students benefit from real-world experience.
What is the goal of the TechCreator platform?
-TechCreator is a platform Zach is developing to help influencers and creators launch their own cohort-based courses. It aims to empower creators by providing a way to monetize their expertise without relying on platforms that take a percentage of earnings. The platform will also allow users to index and organize their social media content, creating searchable archives to share their knowledge more efficiently.
Why does Zach prefer not to use platforms like Maven for his content?
-Zach prefers not to use platforms like Maven because they take a significant cut of the revenue, which he feels is unjustified. He believes that, as an entrepreneur, only companies like Stripe, which provide essential payment infrastructure, should take a percentage of his business. Instead, he opts for a more cost-effective model where he pays flat monthly fees for services, avoiding a percentage-based cut.
How does Zach handle the distribution of his content on DataEngineer.io?
-Zach has built his own platform for DataEngineer.io rather than relying on third-party content platforms. This gives him full control over the distribution and monetization of his content, allowing him to avoid giving away a significant percentage of his revenue while still providing a deep, experience-driven learning environment for his students.
What does Zach see as the main advantage of the community-driven approach to learning in data engineering?
-Zach believes the main advantage of a community-driven approach is that it provides diverse insights from people at various stages of their careers. This collective wisdom allows for deeper understanding and new perspectives that may not be possible through traditional, isolated learning methods. It also fosters collaboration and allows for sharing of real-world, practical knowledge.
What is Zach's view on the importance of guest lectures in his bootcamp?
-Zach sees guest lectures as a key component of his bootcamp's success. By inviting experts in specific areas, he ensures that students receive high-quality, specialized knowledge. He recognizes that there are many aspects of data engineering that others are more knowledgeable about than he is, and he values the opportunity to bring in people who can provide deeper insights on those topics.