In Python, a class can define the special method __call__(); overriding it makes instances of that class callable.

class MyClass:
    def __init__(self):
        pass  # ... some init

    def __call__(self, input1):
        self.my_function(input1)

    def my_function(self, input1):
        print(f"MyClass - print {input1}")

my_obj = MyClass()
# same as calling my_obj.my_function("haha")
my_obj("haha")  # expect to print "MyClass - print haha"

In PyTorch, nn.Module is implemented so that a module instance can be called like the object above, e.g.

import torch
from torch import nn

class myLayer(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(10, 1)

    def forward(self, input_tensor):
        return self.layer1(input_tensor)

model = myLayer()
input_tensor = torch.rand((2, 10))
# treat as callable; nn.Module.__call__ runs model.forward(input_tensor)
model(input_tensor)
…
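To make the link between the two snippets concrete, here is a minimal sketch of how a base class can route a call to forward(). SimpleModule and Doubler are hypothetical names used only for illustration; this is not the real nn.Module, which also runs registered hooks around forward().

```python
class SimpleModule:
    """Hypothetical stand-in for nn.Module; not the real PyTorch class."""

    def __call__(self, *args, **kwargs):
        # The real nn.Module.__call__ also runs registered hooks around
        # forward(); this sketch only shows the dispatch.
        return self.forward(*args, **kwargs)

    def forward(self, *args, **kwargs):
        raise NotImplementedError("subclasses must implement forward()")


class Doubler(SimpleModule):
    def forward(self, x):
        return [2 * value for value in x]


doubler = Doubler()
result = doubler([1, 2, 3])  # goes through __call__, then forward()
print(result)
```

Because the dispatch lives in the base class, a subclass only has to write forward(), which is the same pattern the PyTorch example follows.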

Background

Adding text to a PDF is not trivial, especially for Chinese text, where a font needs to be embedded so that the characters render correctly.

One can see the fonts that are embedded in the PDF to support display of the text

For this article I am working with TrueType fonts, so let’s start with a very brief overview of TrueType fonts (or fonts in general).

The following terms are used:

  1. Font collection: a collection of multiple fonts
  2. Font: a collection of characters rendered in a specific (shared) style
  3. Glyph: the path/curve used to render a character
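One way to picture how these three terms nest is the small data-structure sketch below. The class names (Glyph, Font, FontCollection) and the glyph_for_code_point field are hypothetical and chosen only for illustration; they do not mirror the actual TrueType file layout.

```python
from dataclasses import dataclass, field


@dataclass
class Glyph:
    name: str
    outline: list  # path/curve points; kept abstract in this sketch


@dataclass
class Font:
    style: str
    # Maps a Unicode code point to the glyph that draws it (hypothetical field).
    glyph_for_code_point: dict = field(default_factory=dict)

    def glyph_for(self, character):
        # Returns None when the font has no glyph for the character.
        return self.glyph_for_code_point.get(ord(character))


@dataclass
class FontCollection:
    fonts: list = field(default_factory=list)


regular = Font(style="Regular")
# "\u4e2d" is the Chinese character for "middle"; the outline is made up.
regular.glyph_for_code_point[ord("\u4e2d")] = Glyph(name="zhong", outline=[(0, 0), (10, 0)])
collection = FontCollection(fonts=[regular])

print(collection.fonts[0].glyph_for("\u4e2d").name)
print(collection.fonts[0].glyph_for("A"))
```

The second lookup returns None, which corresponds to the case where a font lacks a glyph for a character and the text cannot be displayed with it.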

Based on my understanding, when a character is rendered, the renderer needs to figure out the glyph to…


Background — What and why of callbacks in framework(s)

One of the important aspects of a deep learning framework is the balance between ease of use and the flexibility to change its behavior.

For ease of use, I like PyTorch Lightning: its rich features are already encapsulated in the core structure (flow), and one can control the run through config/settings (flags passed to the Trainer object).

The simplified idea of a framework and the position of callbacks is as follows:

# the extremely simplified high-level structure of a training loop
for epoch in epochs:
    for batch in dataloader:
        x_in_batch, target = batch
        model_output = model(x_in_batch)
        loss = loss_function(model_output, target)

        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
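As a rough illustration of where callbacks could sit in the loop above, here is a minimal sketch in plain Python. The Callback class, its hook names, and the fit() function are hypothetical and are not PyTorch Lightning's actual API; train_step stands in for the forward pass, backward pass, and optimizer update.

```python
class Callback:
    """Hypothetical base class; hook names are illustrative only."""

    def on_epoch_start(self, epoch):
        pass

    def on_batch_end(self, epoch, batch_index, loss):
        pass

    def on_epoch_end(self, epoch):
        pass


class LossRecorder(Callback):
    def __init__(self):
        self.losses = []

    def on_batch_end(self, epoch, batch_index, loss):
        self.losses.append((epoch, batch_index, loss))


def fit(train_step, batches, num_epochs, callbacks=()):
    # The loop structure is fixed; custom behavior is injected via callbacks.
    for epoch in range(num_epochs):
        for callback in callbacks:
            callback.on_epoch_start(epoch)
        for batch_index, batch in enumerate(batches):
            loss = train_step(batch)
            for callback in callbacks:
                callback.on_batch_end(epoch, batch_index, loss)
        for callback in callbacks:
            callback.on_epoch_end(epoch)


recorder = LossRecorder()
fit(train_step=lambda batch: sum(batch) / len(batch),
    batches=[[1.0, 3.0], [2.0, 2.0]],
    num_epochs=2,
    callbacks=[recorder])
print(recorder.losses)
```

The framework owns the loop, and the user only overrides the hooks they care about, which is the trade-off between ease of use and flexibility described earlier.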

Imagine what we need…


Background

When I used a Jenkins pipeline, I discovered that there is no Source Code Management section like in a normal project, so there seems to be no GUI way to define a parameter for choosing a particular git tag to build from.

“Source Code Management” section in a normal project

I have been using Jenkins for automation (in a customer environment) for a while, with the following use cases:

  1. Continuous deployment
  2. Recurrent network and API verification
  3. API warm up

Continuous deployment

This has been the major use case. The old way of deploying was mostly manual work, which made it challenging to avoid human mistakes.

A deployment could involve the following actions:

  1. Get source code from repository
  2. Compile and link (for some programming languages)
  3. File copy
  4. Change configurations (depending on destination or build)
  5. Restart some services

My setup of a Jenkins project for deployment

First, under the General section, I set up some parameters, such as:

  1. the chosen git tag

Background and how I discovered this

I discovered this when an independent security scan emailed me to say that my cloud VM had port 27017 open.

I had a MongoDB Docker container running on that VM with the simple 27017:27017 port mapping. I believed I had not opened the port on the VM firewall, but when I tried telnet I confirmed that the port was reachable remotely, which is something I needed to fix.
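The same reachability check can be scripted instead of using telnet. Below is a minimal sketch using only the Python standard library; the function name is_port_open is hypothetical, and the host you pass in would be your own VM's public address.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Running is_port_open with the VM's public IP and 27017 from a machine outside the VM should return False once the port is properly closed off.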

Solution

All credit goes to this git comment:

The quick idea is that Docker adds its own firewall rules, which expose every published port.

The below solution is copied from the…


Background and objective

I have been working on a website that needs to

  1. Generate an SVG as base64 text in the backend API (the logic for generating the SVG is not supposed to be exposed, so it lives in the backend)
  2. Add some display elements to the SVG at the frontend (the backend-generated SVG is missing some elements)
  3. Finally, support displaying and downloading the resulting SVG at the frontend.

My (maybe not so good) solution direction

After quite some exploration of what might and might not work, the idea is to leverage canvas as a middle ground for rendering the SVG (as a background image) as well as using…
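For the first step, here is a minimal sketch of turning SVG markup into base64 text. It assumes a Python backend, which the article does not state; the function names svg_to_base64 and to_data_uri and the sample SVG are hypothetical.

```python
import base64


def svg_to_base64(svg_markup):
    """Encode SVG markup as base64 text suitable for a JSON response."""
    return base64.b64encode(svg_markup.encode("utf-8")).decode("ascii")


def to_data_uri(svg_base64):
    # A data URI like this can be used as an image source in a browser.
    return "data:image/svg+xml;base64," + svg_base64


svg = ('<svg xmlns="http://www.w3.org/2000/svg" width="20" height="20">'
       '<circle cx="10" cy="10" r="8"/></svg>')
encoded = svg_to_base64(svg)
print(to_data_uri(encoded)[:40])
```

Decoding the base64 text gives back the original markup, so the frontend can either use the data URI directly as an image or decode it to manipulate the SVG.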


Intention and Background

There are times when we want to search for similar images, but the search algorithm does not know where the “attention” should be within the crowded information of the input image.

So I am thinking of building a web interface that lets the user select a portion of the image as the search input.

What it does

The user loads an image (which appears on the left) and generates superpixels. The user can then click on superpixels to “select” them, and the selected portion is displayed on the right.

The image “Stylish in Marietta” by tvdflickr is licensed under CC BY 2.0

Upon clicking the search button, the “selected portion” is used to search for similar images, and a sliding drawer is displayed (from the left)
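The selection step described above can be sketched as masking an image by superpixel label. This is a toy, pure-Python sketch that assumes a label map has already been produced by some superpixel algorithm; the function name select_superpixels and the sample data are hypothetical.

```python
def select_superpixels(image, labels, selected_labels, background=0):
    """Keep pixels whose superpixel label is selected; blank out the rest."""
    return [
        [pixel if label in selected_labels else background
         for pixel, label in zip(image_row, label_row)]
        for image_row, label_row in zip(image, labels)
    ]


# Toy 3x4 grayscale image and a hypothetical label map with three superpixels.
image = [[10, 20, 30, 40],
         [50, 60, 70, 80],
         [90, 100, 110, 120]]
labels = [[0, 0, 1, 1],
          [0, 2, 2, 1],
          [2, 2, 2, 1]]

clicked = {1}  # labels of the superpixels the user clicked
selected = select_superpixels(image, labels, clicked)
print(selected)
```

Only the pixels belonging to superpixel 1 survive; everything else becomes background, which is the "selected portion" that would be passed on to the search.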


Objective

I often slice a portion of data out of a multi-dimensional vector/tensor. NumPy arrays and PyTorch tensors make slicing very easy, and with very similar syntax. In some scenarios I might need to work from a plain list, and here is one implementation that can be used.

Example 1

A simple 2D list of numbers; I want to slice the input into 3 list-like elements

from:
[[ 1, 2, 3, 4, 5],
 [11,12,13,14,15],
 [21,22,23,24,25]]
into:
[[ 1, 2, 3],  | [ 4,  | [ 5,
 [11,12,13],  |  14,  |  15,
 [21,22,23]]  |  24]  |  25]
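For a plain list of lists, one possible approach is a list comprehension over the rows. This is a minimal sketch and not necessarily the implementation the article goes on to describe; the helper name slice_rows is hypothetical.

```python
def slice_rows(rows, row_slice, column_index):
    """Mimic array[row_slice, column_index] for a list of lists.

    column_index may be a slice (gives a 2D list) or an int (gives a 1D list).
    """
    return [row[column_index] for row in rows[row_slice]]


data = [[1, 2, 3, 4, 5],
        [11, 12, 13, 14, 15],
        [21, 22, 23, 24, 25]]

first_three = slice_rows(data, slice(None), slice(0, 3))  # first three columns
fourth = slice_rows(data, slice(None), 3)                 # fourth column
fifth = slice_rows(data, slice(None), 4)                  # fifth column
print(first_three, fourth, fifth)
```

Passing slice objects keeps the call close to the array[:, 0:3] style of indexing while still working on ordinary lists.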

NumPy array

The syntax is…


Intention

I have had cases where my dataset is not strictly numerical and does not necessarily fit into a tensor, so I have been trying to find a way to manage my data loading beyond passing the input to a PyTorch DataLoader object and letting it automatically sample the batches for me. I have done this multiple times, so I would like to study it a bit more deeply and share it here as a record for my future reference.

Main Reference

PyTorch official reference:

Main Classes / function(s)

Dataset (and their subclasses)

This is not always necessary, especially since our datasets are normally in the form of a list, NumPy array or tensor-like…
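As an illustration of a dataset over non-numerical records, here is a minimal sketch in plain Python. TextRecordDataset and iterate_batches are hypothetical names; the class follows the map-style protocol (__len__ and __getitem__) that PyTorch's Dataset uses, and the hand-rolled batching only stands in for what a DataLoader would normally do.

```python
class TextRecordDataset:
    """Hypothetical map-style dataset over non-numerical records.

    With PyTorch installed, this could subclass torch.utils.data.Dataset;
    the map-style protocol only needs __len__ and __getitem__.
    """

    def __init__(self, records):
        self.records = list(records)

    def __len__(self):
        return len(self.records)

    def __getitem__(self, index):
        record = self.records[index]
        return {"text": record["text"], "label": record["label"]}


def iterate_batches(dataset, batch_size):
    # Hand-rolled sequential batching; a DataLoader would normally do this.
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        yield [dataset[i] for i in range(start, stop)]


dataset = TextRecordDataset([
    {"text": "hello", "label": 0},
    {"text": "world", "label": 1},
    {"text": "again", "label": 0},
])
for batch in iterate_batches(dataset, batch_size=2):
    print([item["text"] for item in batch])
```

Keeping the per-item logic in __getitem__ means the batching strategy can change later without touching how a single record is read.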

Stephen Cow Chau
