This conference consists of short 30-minute sessions (a 25-minute talk plus 5 minutes of Q&A).
To make it easier to participate from anywhere in the world, the conference starts at 11:30 (GMT).
Himalayan Peaks of testing Data Pipelines
Everybody knows what the pyramid of testing is. But what if you can’t afford the whole pyramid? For data pipelines, implementing the entire pyramid is close to impossible, both technically and economically. During this talk, we will discuss who data engineers are, what data pipelines are, and how to test them.
Building Custom Data Applications under Insane Expectations
The speaker spent almost four years building the Data Science and ML teams at Cognite as the company went from nothing to a unicorn. Through those years, he was part of building custom data-driven solutions for companies like Exxon and BP through insanely ambitious digitalization initiatives. This talk is about the challenges, successes, and spectacular failures.
Load Tests of Mobile and Web Applications with locust.io
locust.io is a library for running load tests. It is a great library, but there are a few issues you should be aware of when using it to test mobile and web apps. Writing down the requests the library has to send is not trivial, and coding them by hand is tedious and time-consuming. We will see how to record network traffic into HAR files and convert them into locust.io scripts.
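As a rough illustration of the HAR-to-script idea, the sketch below reads the `log.entries` list of a HAR document and emits the source of a locustfile that replays each recorded request. It is a minimal sketch under assumptions, not the speaker's tool: the function name `har_to_locustfile`, the class name `RecordedUser`, and the sample HAR data are hypothetical, and headers, request bodies, and timing are ignored.

```python
import json
from urllib.parse import urlsplit


def har_to_locustfile(har_text: str) -> str:
    """Turn the requests recorded in a HAR document into locustfile source code."""
    entries = json.loads(har_text)["log"]["entries"]
    lines = [
        "from locust import HttpUser, task, between",
        "",
        "",
        "class RecordedUser(HttpUser):",  # hypothetical class name
        "    wait_time = between(1, 3)",
        "",
        "    @task",
        "    def replay(self):",
    ]
    for entry in entries:
        request = entry["request"]
        parts = urlsplit(request["url"])
        # Keep only path and query; the host is supplied to locust separately.
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        lines.append(
            f"        self.client.request({request['method']!r}, {path!r})"
        )
    if not entries:
        lines.append("        pass")
    return "\n".join(lines) + "\n"


# Hypothetical two-request recording, reduced to the fields used above.
SAMPLE_HAR = json.dumps({
    "log": {
        "entries": [
            {"request": {"method": "GET", "url": "https://example.com/items?page=2"}},
            {"request": {"method": "POST", "url": "https://example.com/login"}},
        ]
    }
})

script = har_to_locustfile(SAMPLE_HAR)
print(script)
```

The generated text can be saved as a locustfile and run with the target host passed on the command line. A real converter would also need to carry over headers, payloads, and authentication.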
Using Chaos Toolkit to Determine Resiliency for Your Web App
Chaos engineering can be made easier using the Python-based Chaos Toolkit. This session covers how to verify that your modern web application is resilient to changes. It also explains how to install Chaos Toolkit and configure it for your existing web app.
Elijah ben Izzy
Hamilton — a Novel Approach for Transforming Data in Python
In this talk, we present Hamilton. Hamilton was initially built to solve the problem of managing a codebase of transforms on pandas DataFrames, enabling a data science team to scale their capabilities with the complexity of their business. Since then, it has grown into a general-purpose tool for writing and maintaining dataflows in Python. We introduce the framework, discuss its motivations and initial successes at Stitch Fix, and share recent extensions that integrate it with distributed compute offerings such as Dask, Ray, and Spark.
Django Apps at Scale: Mistakes to Avoid
Developers make dozens of mistakes when scaling websites to millions of users globally. In my three years of experience building scalable solutions with Django, I’ve learned a lot about what works well and what doesn’t, and I hope to share some useful tips on how to work with this popular web framework. In this talk, you will learn about scaling Django apps using a microservice architecture, transitioning from a monolith to microservices, and mistakes to avoid while building microservices in Django.
Creating Spatial REST APIs in 25 minutes with GeoDjango
This talk presents an easy and effective way to develop a REST API using GeoDjango and Django REST framework within a few minutes. These APIs use filters such as distance, radius, or bounding box (BBOX) to query spatial data and return the result in GeoJSON format. GeoDjango is the geographic module that ships with the fast and stable Django framework. In this talk, we’ll walk through Django REST framework and GeoDjango. We will see how to create a standard API that takes query parameters or request bodies as input to GET, PUT, POST, and DELETE spatial data. We’ll develop spatial queries on top of normal text-based queries and get the data in GeoJSON format, which can be used directly by JavaScript mapping libraries such as OpenLayers and Leaflet.js. Krishna is the founder of .
Property-Based Testing with Hypothesis: Stronger Tests, Less Work
Automated tests are great, but they’re not free: we all want tests that are good at protecting us from bugs, and getting them takes a lot of work. Property-based testing is a technique that saves us much of this work. It uses the computer to generate hundreds or even thousands of test cases so we don’t have to. This helps us find bugs sooner and more easily, and gives us more confidence in our code. This session will point you in the right direction to use property-based tests in your work. We will explore the technique through Python’s excellent Hypothesis framework and review the fundamental concepts, basic usage, and tooling. We’ll also get a feeling for the power and variety of real-world use cases by creating a test that explores a CRUD web application, finding bugs in edge cases we didn’t know about. We’ll finish with pointers and resources to help you get started.
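To make the idea concrete, here is a minimal hand-rolled sketch of a property-based test that uses only the standard library. It deliberately does not use Hypothesis, which offers much richer input generation and shrinks failing inputs to small examples. The run-length `encode`/`decode` pair and the helper names are hypothetical, chosen only to have a property to check: decoding an encoded string returns the original string.

```python
import random
from itertools import groupby


def encode(text):
    """Run-length encode a string into (character, count) pairs."""
    return [(ch, len(list(group))) for ch, group in groupby(text)]


def decode(pairs):
    """Invert encode(): expand (character, count) pairs back into a string."""
    return "".join(ch * count for ch, count in pairs)


def random_text(rng, max_len=20):
    """Generate a short random string over a small alphabet so runs are common."""
    return "".join(rng.choice("ab c") for _ in range(rng.randint(0, max_len)))


def check_round_trip(runs=500, seed=0):
    """Property: decode(encode(text)) == text for every generated text."""
    rng = random.Random(seed)  # fixed seed keeps any failure reproducible
    for _ in range(runs):
        text = random_text(rng)
        assert decode(encode(text)) == text, f"round trip failed for {text!r}"
    return runs


checked = check_round_trip()
print(f"property held for {checked} generated inputs")
```

Instead of listing a handful of example inputs by hand, the test states a rule that must hold for all inputs and lets the machine search for a counterexample.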
Protecting Sensitive Data and Models for Machine Learning
There is a great opportunity to improve your business (even the world) with the application of machine learning. But there are legitimate concerns around protecting AI intellectual property, as well as confidentiality when the data is sensitive. Mike will demonstrate a novel approach to performing predictions on encrypted data so that both the data and the model remain protected at all times, even during processing. We’ll demonstrate this by building a working demo based on end-to-end encryption and confidential computing.
Building Lightning-Fast Apps With asyncio
Modern services must handle vast amounts of traffic efficiently and in a scalable manner. One method of achieving high throughput while keeping things simple is concurrent I/O. In this talk, I will share the story of why we designed an asyncio-based Python service, how its performance exceeded that of the Java service it replaced by an order of magnitude, and what we learned from it. These lessons can help us design super-fast, highly concurrent services. We will talk about the principle behind asyncio’s efficiency (its secret sauce), when asyncio shines, when you might opt for a different approach, and how to combine it with other paradigms to maximize your application’s performance.
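The basic effect of concurrent I/O can be sketched in a few lines. In the example below, `asyncio.sleep` stands in for a network call (an assumption for illustration; the service described in the talk is not shown here), and the hypothetical `fetch` coroutine is started ten times with `asyncio.gather`. Because the waits overlap on a single thread, the total time stays close to one delay rather than ten.

```python
import asyncio
import time


async def fetch(name: str, delay: float) -> str:
    """Pretend to perform an I/O-bound request that takes `delay` seconds."""
    await asyncio.sleep(delay)  # yields control to the event loop while waiting
    return f"{name} done"


async def main():
    start = time.perf_counter()
    # Start all ten simulated requests and wait for them together.
    results = await asyncio.gather(
        *(fetch(f"request-{i}", 0.1) for i in range(10))
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
print(f"{len(results)} requests finished in {elapsed:.2f}s")
```

This only helps when the work is dominated by waiting. CPU-bound work blocks the event loop and usually calls for a different approach, such as processes or a thread pool.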
Unlocking More From Your Audio Data
In a world where content capture and creation sit in your pocket, the amount of audio collected and stored has exploded in recent years, creating a goldmine of unstructured data ready to be explored and used. Just one problem: how do we work with this audio data and make sure we are using it to its full potential? In this talk, we will explore how you can unlock more from your audio data in Python, looking at some of our favorite data extraction and analysis tools and how we used them to better understand the world of podcast creation.