Stats/Code/ML/AI

The Most In Demand Skills for Data Engineers in 2021

If you are preparing to make a career in data or are looking for opportunities to skill-up in your current data-centric role, then this analysis of in-demand skills for 2021, based on over 17,000 Data Engineer job postings, should offer you a good idea as to which programming languages and software tools are increasing and decreasing in importance.

How to Extract Data Observability Metrics from Snowflake Using SQL

Monitor the health of your Snowflake data pipelines with these simple queries
Your team just migrated to Snowflake. Your CTO is all in on this “modern data stack,” or as he calls it: “The Enterprise Data Discovery.” But as any data engineer will tell you, not even the best tools will save you from broken pipelines.

11 Dimensionality reduction techniques you should know in 2021

Reduce the size of your dataset while keeping as much of the variation as possible
In both Statistics and Machine Learning, the number of attributes, features or input variables of a dataset is referred to as its dimensionality. For example, let’s take a very simple dataset containing 2 attributes called Height and Weight. This is a 2-dimensional dataset and any observation of this dataset can be plotted in a 2D plot.
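To make the Height and Weight example concrete, here is a minimal sketch in plain Python. The observation values are invented for illustration, and the projection step uses principal component analysis on two attributes, which is only one of several possible reduction techniques and may not match how the article presents it. The helper names (`dimensionality`, `variance`, `project_to_one_dimension`) are hypothetical.

```python
import math

# Hypothetical observations; the numbers are made up for illustration.
observations = [
    {"Height": 150.0, "Weight": 52.0},
    {"Height": 160.0, "Weight": 58.0},
    {"Height": 170.0, "Weight": 69.0},
    {"Height": 180.0, "Weight": 77.0},
    {"Height": 190.0, "Weight": 90.0},
]


def dimensionality(rows):
    """Number of attributes (columns) in the dataset."""
    return len(rows[0])


def variance(values):
    """Sample variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def project_to_one_dimension(rows, x_key, y_key):
    """Project centered 2D data onto its direction of largest variance."""
    xs = [r[x_key] for r in rows]
    ys = [r[y_key] for r in rows]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cx = [x - mean_x for x in xs]
    cy = [y - mean_y for y in ys]
    n = len(rows)
    var_x = sum(a * a for a in cx) / (n - 1)
    var_y = sum(b * b for b in cy) / (n - 1)
    cov_xy = sum(a * b for a, b in zip(cx, cy)) / (n - 1)
    # Angle of the principal axis of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    ux, uy = math.cos(theta), math.sin(theta)
    return [a * ux + b * uy for a, b in zip(cx, cy)]


scores = project_to_one_dimension(observations, "Height", "Weight")
print(dimensionality(observations))  # two attributes per observation
print(len(scores))                   # one score per observation
```

Each observation is now described by a single score instead of two attributes. The score's variance is at least as large as the variance of either original attribute, which is the sense in which the projection keeps as much of the variation as possible.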

Python Altair Combines Filtering, Grouping, and Merging into a Single Data Visualization

A complete tool for exploratory data analysis
Altair is a statistical data visualization library for Python. It provides a simple and easy-to-understand syntax for creating both static and interactive visualizations. What I think separates Altair from other common data visualization libraries is that it integrates data analysis components into the visualizations seamlessly. Thus, it serves as a highly practical tool for data exploration.

Python Hockey Analytics Tutorial

Python is an open-source programming language that can be used for a wide variety of applications, such as data analysis, data science, data visualization, software and web development, and systems scripting. Python is so powerful that some parts of macOS actually rely on it, and the combination of this power and Python’s intuitive, user-friendly syntax makes it one of the most popular programming languages in the world. If this sounds like something you want to learn, especially within the context of analyzing hockey statistics, you’ve come to the right place. By the end of this tutorial, you will have not only a base-level understanding of Python as a programming language but also enough comfort with Python to perform small-scope data analysis on your own.

Unusual Opportunities for AI, Machine Learning, and Data Scientists

Here are some off-the-beaten-path options to consider when looking for a first job, a new job, or extra income by leveraging your machine learning experience. Many were offers that came to my mailbox at some point in the last 10 years, mostly from people looking at my LinkedIn profile. Thus the importance of growing your network and visibility, write…

What Is Semi-Supervised Learning

Semi-supervised learning is a learning problem that involves a small number of labeled examples and a large number of unlabeled examples. Learning problems of this type are challenging as neither supervised nor unsupervised learning algorithms are able to make effective use of the mixture of labeled and unlabeled data. As such, specialized semi-supervised learning algorithms […]
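As a rough illustration of the setting described above, the following sketch applies self-training, one common semi-supervised approach, to a tiny one-dimensional dataset. It is not taken from the article: the data, the nearest-centroid classifier, the `margin` confidence rule, and all function names are assumptions chosen to keep the example short.

```python
def centroids(points, labels):
    """Mean of the points belonging to each label."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(cents, x):
    """Label of the centroid nearest to x."""
    return min(cents, key=lambda y: abs(x - cents[y]))


def self_train(labeled_x, labeled_y, unlabeled_x, rounds=5, margin=1.0):
    """Grow the labeled set with confidently pseudo-labeled points."""
    xs, ys = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    for _ in range(rounds):
        cents = centroids(xs, ys)
        confident, remaining = [], []
        for x in pool:
            dists = sorted(abs(x - c) for c in cents.values())
            # Confident when the nearest centroid clearly beats the runner-up.
            if len(dists) < 2 or dists[1] - dists[0] >= margin:
                confident.append(x)
            else:
                remaining.append(x)
        if not confident:
            break
        for x in confident:
            xs.append(x)
            ys.append(predict(cents, x))
        pool = remaining
    return centroids(xs, ys)


# A small number of labeled examples and a larger number of unlabeled ones.
labeled_x, labeled_y = [1.0, 9.0], ["low", "high"]
unlabeled_x = [0.5, 1.5, 2.0, 2.5, 8.0, 8.5, 9.5, 10.0, 5.2]

model = self_train(labeled_x, labeled_y, unlabeled_x)
print(model)
print(predict(model, 2.2), predict(model, 8.8))
```

The ambiguous point near the middle is never pseudo-labeled because it fails the margin check, while the remaining unlabeled points refine the centroids that were first estimated from only two labeled examples.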

Enrich Tableau Prep flow data with Einstein Discovery predictions

Rapinder Jawanda Kristin Adderson April 6, 2021 - 7:31pm April 7, 2021
In the April release of Tableau Prep, you can now invoke the power of Salesforce Einstein Discovery to bulk score your data directly in your flow. Bringing these powerful predictive models into Tableau Prep will help people closer to the business to use advanced analytics techniques to uncover practical insights, inform proactive decisions, and solve problems faster. With this integration, you…