top of page

When we learn new skills like programming, there’s a tendency to unknowingly pick up some bad practices. The problem is that unless someone tells you that you’re doing it wrong, those bad practices may stick with you in the long run. This post will cover the good coding practices that you can develop early, so that you don’t have to break bad ones later on. 


Practice Makes Perfect

Practice programming regularly - everyday if possible, but it’s totally fine to skip a day or two. Try to get out of “tutorial hell” as soon as possible and instead do small daily chunks of a personal project. It doesn’t matter what the project is, as long as it’s challenging and interesting enough to keep you motivated. I’m sure that most people in the tech space are already familiar with the 100 days of code challenge, which is perfect for initially getting into the habit.


The first personal projects that I did were for automating daily tasks for work, such as producing weekly revenue reports and sending reminder emails to clients with overdue invoices. When my friend group did secret santa a few years ago, I wrote a quick Google apps script to randomly assign people and then send everyone an email notification letting them know who their secret santa was.


I’ve gradually challenged myself with larger projects like gohan_dousuru, an automation project to generate meal plans and grocery lists for daily cooking.

ree

Here are some of the personal projects on my list, in order from easiest to most difficult.


  1. Create an email newsletter template in html.

  2. Add a cumulative visitor counter to my personal website.

  3. Web scraping and text me when a new issue of my favourite manga comes out.

  4. Build an app to manage my kitchen pantry inventory.

  5. Computer adaptation of a card or board game like Quarto, Battle Line, or poker. ※ I am not an expert on copyright law, but be careful with fan games because they could cause legal trouble. Do not make them distributable such as by putting it on a public Github repo and creating a link. 


Even if you’re just working alone on a hobby project, it’s good practice to use a version control system like Git and do atomic commits. 

ree

If possible, find someone to do pair programming, which often results in cleaner and more efficient code. Try to also get in the habit of gracefully using loops, dictionaries, and dynamic variable names for more elegant code. 


Finally, well-documented code is a valuable asset to any team. If you publish on github, include a readme file that concisely explains how to set up the software. The documentation can also include use cases, debugging scenarios, and acknowledgements for any resources that you found helpful and would like to give credit to. 

Posted: Aug, 19, 2021

Updated: May 19, 2024


While programming may not be a prerequisite for digital marketers, a bit of knowledge will go a long way. In this post, I will share some examples of how R and python, the de-facto standard for marketing technology at scale, can be useful for digital marketers.


Python for Digital Marketing

First, here are the most common python modules for SEO.

  • webbrowser: opens a browser to a specific page

  • requests: downloads files and webpages from the internet

  • bs4: parses html and xml

  • selenium: launches and controls a web browser

  • google-search-results: scrape and parse Google, Google Scholar, Bing, Baidu, Yandex, Yahoo, Ebay results using SerpApi

  • xml.etree.ElementTree: parses and creates xml data

.

webbrowser is part of the python standard library, so it should automatically be installed when you installed python3. For the remaining modules, you can install them with pip, which is also pre-installed. Simply type pip install followed by the module name. For example, pip install bs4.


To check whether the modules are installed run this test in terminal.

$ python -c "import webbrowser"
$ echo $?
0 # webbrowser module exists in system

$ python -c "import bs4"
Traceback (most recent call last):
File "<string>", line 1, in <module>
ImportError: No module named bs4
$ echo $?
1 # bs4 module does not exist in system

To get a list of installed modules without pip, type help("modules") in any python console. To do the same thing using pip, type pip list.


Optimize Image Data

Some content management systems like self-hosted WordPress don't support .heic files. Use this script to convert photos taken on your iPhone to .png

import pillow_heif
from PIL import Image

pillow_heif.register_heif_opener()

img = Image.open("C:/User/Desktop/photo1.heic")
img.save("C:/User/Desktop/photo1.png", format('png'))

The following script strips elif metadata from image files, for improved privacy and security.

image = Image.open("C:/User/Desktop/photo1.png")

data = list(image.getdata())
image_stripped = Image.new(image.mode, image.size)
image_stripped.putdata(data)

image_stripped.save("C:/User/Desktop/photo1.png")
 
image_without_exif.close()

In addition, Optimize Images is a command-line interface (CLI) utility to help you reduce image file sizes and boost page speed and performance.


Hashlib

This module implements a common interface to many different secure hash and message digest algorithms. It's already in the basic anaconda/python install so you don't need to get any new libraries.

URL scraping

This article explains how to scrape search results pages in bulk using python. The author provides an example of scraping hostels data from Hostelworld, but the same workflow can be applied for your specific needs.


One of the most common errors when web scraping is the http 403 forbidden error. This usually happens because the website detects that you are a scraper, and can be resolved by using a fake user agent, implementing session management, or using the time.sleep() function to add a delay between requests. However, if you get an error like Socket exception: Connection reset by peer (104), the website's admin may have blocked you due to some weird or heavy scraping code.


For more on web scraping with python for beginners, see the web scraping section of Automate the Boring Stuff with Python

  • It's important to review the website's robots.txt file and terms of service before scraping any website.


R for Digital Marketing

R is a programming language for statistical computing, developed from the earlier version S. It can be expanded with a library of 10,000+ packages, which are add-on features for accomplishing tasks - kind of similar to the vast range of WordPress plugins. This GitHub repository has a good list of R packages for digital marketing so that you can interact with well-loved tools such as Google Analytics, GTM, Search Console, and social media ads.


Tidyverse

When you first install R from CRAN, you will have base R, which includes the necessary machinery to run R on your computer, as well as standard R packages - stats, utils, graphics. Next, download and install tidyverse with the following command.

install.packages("tidyverse")

Tidyverse was born in 2014, and is becoming increasingly mature. It includes the following core packages:

  • readr: read data

  • tidyr: tidy data

  • dplyr: transform data and work on relational databases

  • ggplot2: plot data and create static charts for visualization ※ For rich interactive charts, use htmlwidget instead

.

Working in R is like a spreadsheet, but you interact with it through programming functions instead of keyboard shortcuts or clicking around with your mouse. From automating reports to analyzing A/B tests, R offers a blue ocean of uses for digital marketing.


Structural Equation Modeling

Structural Equation Modeling (SEM) using the lavaan package is a broad topic within the R statistical programming language, building on linear regression described above. One marketing application of SEM is customer survey validation. Survey responses are often interrelated in complex ways, and you can use R to reduce data fields and analyze how those underlying latent variables are related to one another.


Choice Model

Choice models are used to predict the market share for new products. This is an analysis of how customers think about feature prioritization: product, pricing, and ultimately profile design. If the product designer is creating multiple products, they can use the choice model to find the optimal coverage of as many important priorities as possible.


In this workshop by the authors of R for Marketing Research and Analytics, Chris Chapman provides an in-depth example of structural equation modeling in the first half, and Elea McDonnell Feit provides a detailed explanation of choice modeling in the second half.





The purpose of this post is not to provide the answers, but rather to share some study notes for passing the PL-300 exam on the first attempt.


Data Framework

Power BI Desktop vs Power BI Service venn diagram
  • Power BI Service: a cloud-based service, or SaaS that supports report editing and collaboration for teams and organizations

  • Power BI Desktop: free application that you download and install on your local computer

  • Storage mode: property that you can set on each table in your model, controls how Power BI caches the table data 

    • Configure in Power BI Desktop Model view

  • Import: allows you to create a local Power BI copy of your semantic models from your data source

    • Can use all Power BI service features with this storage mode, including Q&A and Quick Insights

    • Data refreshes can be scheduled or on-demand

    • Default mode for creating new Power BI reports. 

  • DirectQuery: direct connection to the data source

    • Data won't be cached

    • Ensures that you're always viewing the most up-to-date data, and that all security requirements are satisfied.

    • Solves data latency issues for large semantic models 

    • Automatic page refresh in Power BI Desktop and Power BI Service

      • Fixed interval: update all visuals in a report page based on a constant interval such as one second or five minutes

      • Change detection: refresh visuals on a page based on detecting changes in the data rather than a specific refresh interval

  • Dual/Composite: can have both import and DirectQuery

    • Hybrid tables combine changing real-time data from DirectQuery with static old data from import

    • Tables can act as either cached or not cached, depending on the context of the query that's submitted to the semantic model

    • Setting the related dimension tables to dual storage mode will allow Power BI to satisfy higher-grain queries entirely from cache

  • Incremental Refresh: refresh large semantic models quickly and as often as needed, without having to reload historical data each time 

    • Requires Power BI Pro

    • Supported data sources: SQL Server, Oracle, PostgreSQL, MySQL, Teradata

    • How to setup:

  • Define filter parameters RangeStart and RangeEnd configure the start and end of where Incremental refresh should occur 

  • Use the parameters to apply a filter

  • Define the incremental refresh policy

  • Publish changes to Power BI Service


  • Query caching: instructs the Power BI Premium or Power BI Embedded capacity to use its local caching service to maintain query results, avoiding having the underlying data source compute those results

    • Respects bookmarks, filters, security labels

  • Settings

    • Filters Query reduction: instantly apply basic filter changes

    • Persistent filters of Report: prevent users from saving filters in the Power BI service

    • Query reduction

      • Reduce number of queries: disables the default behavior that automatically applies cross-highlighting and filtering of visuals within the same report

      • Slicers Query reduction: instantly apply slicer changes and add an Apply button to each slicer

Snowflake Schema


star schema diagram


snowflake schema diagram

Deployment Pipeline tool: enables users to manage the development lifecycle of content within their tenant

  • Available within the Power BI Service with a Premium Capacity license

Deployment workflow

Data Connections

  • Azure Data Box: solution for migrating data to Azure

  • VPN: provides connectivity to online data sources such as Azure virtual network

    • Not installed locally and only works with data sources secured by virtual networks

On-premises data gateway flow
  • Power BI gateway (on-premises data gateway): quick and secure data transfer between on-premises data, which is data that isn't in the cloud, and several Microsoft cloud services such as Power BI, Power Apps, Power Automate, Azure Analysis Services, and Azure Logic Apps

    • Personal mode only allows one user to access multiple data sources


Data Profiling Tools

  • Column profile: in-depth column statistics chart

  • Column distribution: provides a set of visuals underneath the names of the columns that showcase the frequency and distribution of the values in each of the columns

  • Column quality: labels values in rows in five categories: valid, error, empty, unknown, and unexpected error

Screenshot of Power BI column quality

DAX calculations

  • Calculated column: calculation used to add a column to a tabular model by writing a DAX formula that returns a scalar value, and is evaluated in row context

    • Can be added to an Import or DirectQuery storage mode table.

    • Can be used in a slicer to place filter options on the report page (“Filters on this page”)

  • Measure: calculation that is performed on the fly, based on the data in the columns of the table 

    • Can only be placed per visual, in the "Filters on this visual" well of the Filters pane 

  • Compound measure: measure that references another measure 

  • Implicit measure: an automatically generated calculation to summarize column values 

    • End-users can select from one of nine aggregations when placed in the Values well of a visual

    • Both implicit and explicit measures can be used as a Drillthrough field, to create quick measures, and with Field Parameters

  • Quick measure: feature in Power BI Desktop that eliminates the need to write DAX formulas for commonly defined measures 

    • Ex. average per category, rank, and difference from baseline 

  • Apply calculations to fields 

  • * Time intelligence functions have performance implications and are disabled for quick measures against DirectQuery tables 

  • Calculated table: model calculation used to add a table to a tabular model by writing a DAX formula that returns a table object that uses Import storage mode.

    • Store data inside the model, so adding them results in a larger model size 

    • Not evaluated in any context 

    • Duplicates data, not including model configurations such as model visibility or hierarchies


DAX functions

  • ALL: returns all the rows in a table, or all the values in a column, ignoring any filters that might have been applied

  • ALLEXCEPT: Removes all context filters in the table except filters that have been applied to the specified columns

  • CALCULATE: provides the result of the calculation with the ability to override context

  • CALENDAR: returns a table with a column named Date that contains a contiguous set of dates based on the start and end dates you specify

  • CALENDARAUTO: returns a table with a column named Date that contains a contiguous set of dates based on data in the model

  • COUNTROWS: summarizes a table by returning the number of rows 

  • CROSSJOIN: returns a Cartesian product of all rows from all tables that the function references

  • EXCEPT: returns rows from one table which do not appear in another table

  • FILTER: returns a table that represents a subset of another table or expression

  • IGNORE: omits specific expressions from the BLANK/NULL evaluation

  • PATH: returns a string with identifiers of all the parents of the current identifier, which is used for flattening

  • PATHITEM: returns the item at the specified position of a string, which is also used for flattening

  • RELATED: returns a related value from another table

  • SUMX: returns the sum of an expression evaluated for each row in a table

  • USERELATIONSHIP: activate an inactive relationship


M Functions

  • date: creates a date value based on the data parameters you specify

  • duration: allows you to specify the datetime values that will be entered into individual rows of a date table

  • List.Combine(): combines multiple lists into one

  • List.Durations: returns a list of count duration values

Reports

  • Options 

    • Pin visual: pin the visual to an existing dashboard or create a new one 

    • Copy visual as image: copy a visual as an image to Clipboard 

    • Export data: export data in xlsx or csv 

    • Spotlight: highlight a visual on the report page 

  • Interaction behaviour 

    • Filter: shows you the filtered data in this visual 

    • Highlight: the default interaction between visuals, shows you both the unfiltered and filtered values in the visual, for comparison purposes 

    • Drillthrough: page navigation experience that takes you from one page to another plus applies a set of filters to page navigated to 

    • Expand: a way to navigate down a level using the hierarchy controls

  • Accessibility: consider how report consumers with no-to-low vision or other physical disability can fully experience the reports 

    • Use contrasting colors, clear and large-sized fonts, well-spaced and large visuals, intuitive navigation  

  • Form factor: size of the hardware that is used to open reports and page orientation (portrait or landscape)

  • Dashboard: canvas of report elements that can be built in Power BI service, and can connect multiple data sources *unlike reports which can only use one single data source


Analytics & Visualization

  • Features

    • Analyze in Excel: create an Excel workbook containing the entire semantic model for a specific Power BI report and analyze that semantic model in Excel using PivotTables, Pivot Charts, and other Excel features

    • Cross-highlighting: selecting a value in one visual highlights the related data in visuals such as column and bar charts

    • Q&A: create a visual by typing in a question about your data

      • Select "Ask a question about your data"

      • The visual can be pinned to a dashboard without adding it to a report

    • Quick Insights: generate interesting visualizations based on your data


Example cluster chart
Cluster chart example
  • Visuals

    • Clustering: identify a segment (cluster) of data that is similar to each other but dissimilar to the rest of the data 

    • Key influencers: helps you understand correlated factors impacting a particular metric

    • Python: create by first enabling the script visuals option in the Visualization pane in Power BI Desktop

      • No need to actually install python on your computer

    • Q&A: allows end-users to ask natural language questions to create AI-generated charts

    • Smart Narrative: combines natural language text with metrics from your model in sentence forms

    • Decomposition Tree: visualize data between multiple dimensions and drill down in any order

Decomposition tree example
Decomposition tree example
  • Personalize visuals

    • Enable in either Power BI Desktop or Power BI Service

    • Switch on/off at a page or visual level

    • Perspectives: choose a subset of a model that provides a more focused view

    • End-user modifications:

      • Change the visualization type

      • Swap out a measure or dimension

      • Add or remove a legend

      • Compare two or more measures

      • Change aggregations, etc


Data Administration

  • Lineage view: view and troubleshoot the data flow from source to destination 

    • Simplifies troubleshooting process

    • Allows you to manage workspaces and observe impact of a single change

    • Saves time by simplifying a task of identifying what hasn't been replaced this view

    • Access to admin, member and contributor roles

  • Data alerts: Power BI sends an alert to your Notification center and, optionally, an email 

    • Alerts only work on streaming datasets if you build a KPI, card, or gauge report visual and then pin that visual to the dashboard. 

  • Row-level security (RLS): design technique to restrict access to subsets of data for specific users

    • In a tabular model, RLS is achieved by creating model roles. 

      • Power BI Desktop > Manage roles

    • How to create a role: 

      • Model view - design and implement structure of a dataset and includes the option to create a role 

      • Report view - provides the ability to manage roles, including their creation 

    • Roles have rules, which are DAX expressions to filter table rows. 

    • Use Power BI Desktop or Power BI service to test RLS 

      • USEROBJECTID() returns the SID of the sign-in user but not their name 

      • USERPRINCIPALNAME() returns which user is signed in to view a report 

    • RLS is not required on a dataset to become discoverable

  • Performance analyzer: see and record logs that measure how each of your report elements performs when users interact with them and which aspects of their performance are most (or least) resource intensive

    • Before running Performance analyzer, to ensure the most accurate results in your analysis (test), start with a clear visual cache and a clear data engine cache

      • To clear the visual cache, add a blank page to the .pbix file and select it

      • To clear the data cache, either restart Power BI Desktop or connect DAX Studio to the semantic model and then call Clear Cache 

  • Endorse semantic models 

    • Promotion: promote your semantic models when they're ready for broad usage 

      • Promoted dataset can be configured to be discoverable for users without access to request permissions to access

    • Certification: request certification for a promoted semantic model 

      • Certified dataset can be configured to be discoverable for users without access to request permissions to access

      • Certification can be a highly selective process, so only the truly reliable and authoritative semantic models are used across the organization    

  • Workspace settings

    • Workspace OneDrive: configure a Microsoft 365 group whose SharePoint Online document library is available to workspace users once the workspace is created

    • Allow contributors to update the app: provide additional permissions for workspace contributors

    • Develop a template: set up a template app workspace

    • License mode: choose between Pro, Premium per user, Premium per capacity, and Embedded licensing

  • Tenant: the container for items of your Organization such as users, domains, subscriptions etc.

Power BI tenant diagram
  • Workspace roles (privilege high to low)

  • Admin: can update and delete workspace, add or remove anyone

  • Member: can add members, un/publish and change permission for apps

  • Contributor: can edit, delete, and publish content (reports)

  • Viewer: can view and interact with an item, read data that's stored in dataflows


Practice Assessment

There is a free practice assessment on Microsoft Learn. To be better prepared for the exam, it is recommended to achieve 80% or higher in multiple attempts, which is what I did as well.


Screenshot of Power BI PL-300 practice assessment results

For each practice assessment that you complete, Microsoft Learn provides a score breakdown and detailed explanations for every question.

Screenshot of Power BI PL-300 practice assessment results

Note that Microsoft periodically update the exam content. I took my exam right after the April '24 update, and they had just added some new skill areas: automatic page refresh, personalized visuals, and accessibility.


bottom of page