Six-ish Things I Learned This Week | CW16 CY2020

This is the first edition of Six-ish Things I Learned This Week. Every week I come across many new things, be it in programming, networking or PowerPoint. This is my attempt to write them down in a concise and searchable way so that I don't forget them. Maybe you will find something interesting as well.

This week is extremely python-heavy as I haven't been doing anything but writing python code in the past seven days. Let's see if this changes for the next edition. I am not ruling out any topics but without further ado, here are a few cool things I (re)discovered or learned this week.

Table of contents

What does YAML stand for?
Versioning python packages with versioneer
Easily rate limiting flask apps
Getting the trackingId from a failed request to the Webex Teams API when using the webexteamssdk
How Slicing works in python3
Validating data models with pydantic

What does YAML stand for?

Let's start this off with something slightly embarrassing. After dealing with and using YAML files for at least five years now I could have sworn YAML stands for Yet Another Markup Language. During a virtual DevNet Express event this week I discovered that, according to the official yaml.org webpage, it means YAML Ain't Markup Language. You never stop learning four-letter acronyms I guess.

Versioning python packages with versioneer

Versioning a package or application can be a bit of a pain, especially if you are releasing frequently and want your git commit hash to be part of the version string. This one is technically a cheat in this category as I have known about versioneer, a python package that automates versioning python packages based on commits and tags in your version control system, for some time but never actually used it. So that I don't forget it in the future, here is how you can add versioneer-based versioning to your python package.

Install versioneer

$ pip install versioneer

First, add a [versioneer] section to your setup.cfg. This section will specify things like which VCS to use, what version style to use and where to put the version file in the source tree.

[versioneer]
VCS = git
style = pep440
versionfile_source = <path_to_your_package_root>/_version.py
tag_prefix = v

Next, we need to tell our setup.py about versioneer and that its version number is the one that should be used. To do so, modify your setup.py to import versioneer and use it for the version string.

import versioneer
# [...] All other imports 

setup(
    version=versioneer.get_version(),
    cmdclass=versioneer.get_cmdclass(),
    # [...]
    # Rest of your setup function follows here. 
    # Make sure to delete any other 'version'
    # or cmdclass args.
)

Now as a final step we need to run versioneer install in the root directory of your project.

$ versioneer install

This will copy the versioneer.py script into your project, create a _version.py file in the source code directory (specified in the configuration by versionfile_source) and add a __version__ variable to your package's __init__.py file. You can now use <name_of_your_package>.__version__ to access your version number.

This configuration will use the versioning style specified in PEP 440. In the simplest case the version number is the git tag with the tag_prefix removed. If you are 5 commits ahead of the last tag, this is indicated by a +5 that is appended to the tag. To identify which commit you are on, the abbreviated commit hash, prefixed with a g, is appended as well.

So the version number for a package built from commit 9cd166ecd1d25d1855801ebc05eaa6fa9d7779ae2705a8637a612b92c05ea5d7 that is 6 commits ahead of the last tag, which was v1.0, would be 1.0+6.g9cd166e. If your working directory additionally contains uncommitted changes, this is acknowledged by appending .dirty. So the above version with additional uncommitted changes would have the version string 1.0+6.g9cd166e.dirty.
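To make the format concrete, here is a small sketch that assembles a version string following the TAG[+DISTANCE.gHASH[.dirty]] layout of the pep440 style. This is not versioneer's own code; the function name build_version_string and its parameters are made up for illustration.

```python
def build_version_string(tag, distance, short_hash, dirty, tag_prefix="v"):
    """Assemble TAG[+DISTANCE.gHASH[.dirty]] from the pieces git reports."""
    # Strip the configured prefix so the tag "v1.0" becomes version "1.0".
    version = tag[len(tag_prefix):] if tag.startswith(tag_prefix) else tag
    if distance or dirty:
        # Local version part: commits since the tag and the abbreviated hash.
        version += f"+{distance}.g{short_hash}"
        if dirty:
            version += ".dirty"
    return version


print(build_version_string("v1.0", 0, "9cd166e", dirty=False))  # 1.0
print(build_version_string("v1.0", 6, "9cd166e", dirty=False))  # 1.0+6.g9cd166e
print(build_version_string("v1.0", 6, "9cd166e", dirty=True))   # 1.0+6.g9cd166e.dirty
```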

Easily rate limiting flask apps

When building a REST API you might come to the point where you want to rate limit the access to your API. Rate limiting means that you put a limit on how many requests a user can make in a given timeframe. It's super useful to keep your API from being flooded with requests that could lead to deteriorating performance or completely unresponsive endpoints.

If you are using flask to build your API there is a package to introduce these limits in a very convenient fashion. The Flask-Limiter package is an MIT-licensed flask extension developed by @alisaifee on GitHub that lets you decorate flask endpoints to have rate limits.

Let's have a look at how to limit a flask endpoint to 5 calls per second and 60 calls per minute.

First, install the package by running

$ pip install flask-limiter

With that done we can start rate-limiting our endpoints.

from flask import Flask, jsonify
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address

app = Flask(__name__)
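limiter = Limiter(
    key_func=get_remote_address,  # identify clients by their remote address
    app=app,  # passed by keyword, which newer Flask-Limiter releases require
    default_limits=["2 per minute", "1 per second"],
)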

@app.route("/api/v1/random_number")
@limiter.limit("5/second;60/minute")
def api_random_number():
    return jsonify({"number": 4})

As you can see we specify the limits in a human-readable way with the @limiter.limit decorator. This example uses the remote address as the key to identify a user (provided by the get_remote_address function). You can also write your own key function to, for example, limit the number of requests per API key or per user that might be stored in your session or elsewhere. You can also make the request counts persistent between restarts by using redis or memcached as a storage backend. Check the documentation for all the capabilities this cool little package has.
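To illustrate what a limit per key means, here is a framework-free sketch of a fixed-window counter. It is not how Flask-Limiter is implemented; the class name FixedWindowLimiter, the injectable clock and the example keys are assumptions made for this illustration.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` hits per `window` seconds for each key."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        # key -> (window start time, hits counted in that window)
        self._state = {}

    def hit(self, key):
        """Record a request for `key` and return True if it is allowed."""
        now = self.clock()
        start, count = self._state.get(key, (now, 0))
        if now - start >= self.window:
            # The previous window has expired, so start a fresh one.
            start, count = now, 0
        if count >= self.limit:
            self._state[key] = (start, count)
            return False
        self._state[key] = (start, count + 1)
        return True


# A fake clock keeps the example deterministic.
fake_now = [0.0]
limiter = FixedWindowLimiter(limit=2, window=1.0, clock=lambda: fake_now[0])

results = [limiter.hit("api-key-a") for _ in range(3)]
print(results)  # [True, True, False]

other = limiter.hit("api-key-b")  # a different key has its own counter
fake_now[0] = 1.5                 # move past the end of the first window
after_window = limiter.hit("api-key-a")
print(other, after_window)  # True True
```

The key plays the same role as the value returned by a key function: every distinct key gets its own counter.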

Getting the trackingId from a failed request to the Webex Teams API when using the webexteamssdk

If I had to choose a python package that has had the biggest impact on my (work) life in the past two years it has to be Chris Lunsford's incredible python SDK for the Webex Teams REST API. Chris has done an outstanding job of making dealing with the API feel like acting upon local objects in python.

As an example, instead of doing a POST request to /v1/messages with a JSON body containing a destination email address and message as well as setting the authorisation headers correctly I can just do

from webexteamssdk import WebexTeamsAPI

api = WebexTeamsAPI()
api.messages.create(toPersonEmail="mn@mail.com", markdown="**test message**")

Need more convincing how powerful this is? Let's send a message to everyone that is a member of a specific space (or room). And let's do that in 5 lines of code (I have omitted the two lines needed to import and instantiate the api object in the example below):

ROOM_ID = "..."
for membership in api.memberships.list(roomId=ROOM_ID):
    api.messages.create(toPersonEmail=membership.personEmail, markdown="Thanks for being part of the space")

The SDK takes care of returning an iterable object containing memberships as well as dealing with pagination and Retry-After headers when we send too many requests at once.

Now if a request fails, one thing you might get asked for by the Webex API support team is a trackingId so that they can trace your request through the system. This trackingId, however, is not immediately visible from the ApiError exception that gets raised by the webexteamssdk. You can access that trackingId by checking the raw response that led to the error:

from webexteamssdk import WebexTeamsAPI, ApiError

api = WebexTeamsAPI()

try:
    api.messages.create(toPersonEmail="mn@mail.com", markdown="**test message**")
except ApiError as e:
    tracking_id = e.response.json()['trackingId']
    print(f"Tracking Id: {tracking_id}")
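An error response is not guaranteed to carry a JSON body, so a slightly more defensive variant can help. The sketch below assumes the response attached to the exception behaves like a requests response, meaning json() raises a ValueError subclass for an unparsable body. The helper extract_tracking_id, the FakeResponse stand-in and the tracking id value are all made up for illustration.

```python
def extract_tracking_id(response):
    """Return the trackingId from an error response, or None if unavailable."""
    try:
        body = response.json()
    except ValueError:
        # The body was empty or not valid JSON.
        return None
    if isinstance(body, dict):
        return body.get("trackingId")
    return None


class FakeResponse:
    """Stand-in for the response attached to an ApiError (illustration only)."""

    def __init__(self, payload):
        self._payload = payload

    def json(self):
        if self._payload is None:
            raise ValueError("no JSON body")
        return self._payload


with_id = FakeResponse({"message": "Not found", "trackingId": "EXAMPLE_123"})
without_body = FakeResponse(None)

print(extract_tracking_id(with_id))       # EXAMPLE_123
print(extract_tracking_id(without_body))  # None
```

In the except block above this would be called as extract_tracking_id(e.response).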

How Slicing works in python3

Ever wondered how slicing, the ability in python to select a range of indexes from a list (l[2:4]), works? There is no better way than finding out by implementing slicing on a custom class.

The beauty of python is that everything is an object and you can customize your own classes to respond to built-in operators like +, - or, and this is what we want to do here, the index operator [].

I recently wanted to abuse this system. For a little project I did not want to implement list-like behaviour but rather extract the start and end of a slice. What I was trying to achieve was the following:

>>> c = Custome()
>>> c[4]
Accessed key is 4
>>> c[0:4]
Start of the slice is 0, end of the slice is 4

Every operator in python3 is implemented by a magic method. These are methods beginning and ending with two underscores that python calls on your objects when you invoke an operator or, for example, a cast operation. The most prominent of these, which every django developer knows, is the __str__ method that returns the string representation of an object and allows the str() function to be called on an object.

The method used for acting on the [] operator is __getitem__. This instance method takes a key argument that can either be a literal key (think l[3] giving you 3 as the key) or a slice object. So we can act upon these differently.

class Custome:
    def __getitem__(self, key):
        if isinstance(key, slice):
            # key is a slice object; its bounds live in .start and .stop
            print(f"Start of the slice is {key.start}, end of the slice is {key.stop}")
        else:
            print(f"Accessed key is {key}")

As you can see I am checking if the key is an instance of the built-in slice type. If this is the case I can access the start and stop attributes of that object to extract my slice range. For the record there is also a step attribute if you specify a step size (i.e. when you slice with l[1:10:2] the 2 is the step size, similar to how the built-in range function works).
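If you want to poke at this yourself, the following sketch returns what python hands to __getitem__ instead of printing it. The class name SliceInspector is made up for this example. It also shows that omitted parts of a slice arrive as None and that slice.indices() can fill them in for a given length.

```python
class SliceInspector:
    """Return what python passes to __getitem__ instead of indexing anything."""

    def __getitem__(self, key):
        if isinstance(key, slice):
            return (key.start, key.stop, key.step)
        return key


inspector = SliceInspector()
print(inspector[4])       # 4
print(inspector[0:4])     # (0, 4, None)
print(inspector[1:10:2])  # (1, 10, 2)
print(inspector[:])       # (None, None, None)

# slice.indices(length) resolves the omitted parts for a sequence of that length.
print(slice(None, None, 2).indices(5))  # (0, 5, 2)
```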

Pretty cool python, pretty cool!

Validating data models with pydantic

The fact that python is not statically typed is both a curse and a blessing. If a programming language is statically typed (examples would be Java or C++) that means that the type of every variable, be it int, string or a custom class, is known at compile time. Python is, by default, dynamically typed which means that you can declare a variable without specifying a type. The python interpreter will infer the type of the variable based on its value. So when you type the code below

>>> slogan = "Python is great"
>>> print(type(slogan))
<class 'str'>

python will, in the background, infer that slogan is a string and has that type. Now, this is great because it means that you don't have to worry about variable types, saving you a few keystrokes every time you declare a variable. What makes it tricky, however, is that, technically speaking, every time you write a function that has some parameters you have to check that each parameter is of the type you are expecting. To illustrate:

def do_something(str_input):
    split = str_input.split()
    return " ".join([s.capitalize() for s in split])
# Case number 1
print(do_something("this is a sentence"))
# Case number 2
print(do_something(2))

This piece of code works perfectly fine in case number 1 and capitalizes the first letter of every word in the passed sentence. Case 2, where we pass the number 2 instead of a string, fails with an AttributeError since we are using split, which is a built-in method of the str data type. A manual way to fix this would be to check if str_input is an instance of str.

def do_something(str_input):
    if isinstance(str_input, str):
        split = str_input.split()
        return " ".join([s.capitalize() for s in split])

Now this is cumbersome and inflexible at the same time. The python developers came to the same conclusion and, starting with version 3.5, python supports type hints. PEP 484 defines how to annotate which types your function expects and returns. With type hints, the above function would look like this:

def do_something(str_input: str) -> str:
    split = str_input.split()
    return " ".join([s.capitalize() for s in split])

Now, while this is a great hint to you as a developer, PEP 484 explicitly states in its non-goals that type checking at runtime is left to third-party packages. So even with the above type hints a number can still be passed, and nothing fails until the split method is called on it.
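A quick way to convince yourself of this is to look at where the hints end up. In the sketch below the annotations are stored on the function object, yet passing a number is only rejected once split is called on it.

```python
def do_something(str_input: str) -> str:
    split = str_input.split()
    return " ".join([s.capitalize() for s in split])


# The hints are stored on the function object but not checked on call.
print(do_something.__annotations__)
# {'str_input': <class 'str'>, 'return': <class 'str'>}

try:
    do_something(2)
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)  # 'int' object has no attribute 'split'
```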

One such third-party package is pydantic. With this little tool you can write python data classes whose variables will be checked to match a specified type. To see this in action here is a (truncated) Device class I wrote for a little project this week:

from pydantic import BaseModel
from typing import List, Optional 

class SipUrl(BaseModel):
    pass

class Device(BaseModel):
    id: str
    displayName: Optional[str] = None  # may be omitted, so the type is Optional
    sipUrls: List[SipUrl] = []

d = Device(id="test_id")

In the Device class, the id is specified as a member variable of type string and, since there is no default value provided, it is not optional. Trying to instantiate a Device object without specifying an id will result in a ValidationError that specifies that the id field is required but missing. The other two variables, displayName and sipUrls, have default values. As sipUrls shows, fields can also use custom classes and containers like lists.
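Here is a short sketch of what the missing-field case looks like in practice. It uses a trimmed copy of the Device class and relies only on ValidationError and its errors() method. The exact wording of the error message differs between pydantic versions, so only the field location is inspected.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Device(BaseModel):
    id: str
    displayName: Optional[str] = None


try:
    Device()  # id is required but not given
except ValidationError as exc:
    # Each error entry carries the location of the offending field.
    missing = [err["loc"] for err in exc.errors()]
    print(missing)  # [('id',)]
```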

Pydantic also adds some convenience methods like .dict() and .json() to turn your data model into a dictionary or JSON. The inverse functions are also present: you can use parse_obj, parse_raw and parse_file to create an object from a dict, a JSON string or a file path.

One last cool feature of pydantic is that it will convert variables when possible. Let's say you specify a field to hold ints and pass '2', which is a string. You might expect this to fail since '2' is not an int, but the value can be parsed into an int automatically, which is exactly what pydantic is going to do.

class TestModel(BaseModel):
    id: int

t1 = TestModel(id=1)
t2 = TestModel(id='2')

print(type(t2.dict()['id']))

This code will work, and checking the type of the id field will yield int.
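The conversion only happens when the value can actually be parsed. The sketch below, which assumes pydantic's default (non-strict) behaviour, shows both outcomes. The model name ItemModel is made up for this example.

```python
from pydantic import BaseModel, ValidationError


class ItemModel(BaseModel):
    id: int


converted = ItemModel(id="2")
print(type(converted.id))  # <class 'int'>

try:
    ItemModel(id="two")  # cannot be parsed into an int
except ValidationError as exc:
    failed_fields = [err["loc"] for err in exc.errors()]
    print(failed_fields)  # [('id',)]
```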

Category: Six-Ish Things I Learned This Week

Tags: python