Background and how I discovered this

I discovered this when an independent security scanning service emailed me to say that my cloud VM had port 27017 open.

I had a MongoDB Docker container running on that VM with the simple 27017:27017 port mapping. I believed I had not opened the port on the VM firewall, but when I tried telnet I confirmed that the port was reachable remotely, which is something I needed to fix.
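The telnet check described above can also be scripted. This is a minimal sketch using only the Python standard library; the function name `is_port_open` and the default timeout are my own choices for illustration, not something from the original setup.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False
```

Run it from a machine outside the VM, for example `is_port_open("<your VM public IP>", 27017)`. A result of `True` means the port is reachable remotely, which is the same thing the telnet test shows.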

Solution

All credit goes to this git comment:

The short version is that Docker adds its own firewall rules and opens every port it publishes.

The below solution is copied from the…


Background and objective

I have been working on a website that needs to:

  1. generate an SVG as base64 text in the backend API (the logic for generating the SVG is not supposed to be exposed, so it lives in the backend)
  2. add some display elements to the SVG in the frontend (the backend-generated SVG is missing some elements)
  3. and finally support displaying and downloading the resulting SVG in the frontend.
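Step 1 can be sketched as follows, assuming a Python backend (the original does not say which backend language is used). The names `svg_to_base64`, `svg_data_uri`, and `SAMPLE_SVG` are hypothetical, and the sample SVG is only a placeholder for whatever the real generation logic produces.

```python
import base64


def svg_to_base64(svg_text: str) -> str:
    """Encode SVG markup as base64 text, suitable for a JSON API response."""
    return base64.b64encode(svg_text.encode("utf-8")).decode("ascii")


def svg_data_uri(svg_text: str) -> str:
    """Wrap the base64 text in a data URI that an image element can load."""
    return "data:image/svg+xml;base64," + svg_to_base64(svg_text)


# Placeholder SVG, standing in for the backend's real (private) generation logic
SAMPLE_SVG = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">'
    '<circle cx="50" cy="50" r="40" fill="steelblue"/>'
    "</svg>"
)
```

Returning the data URI form means the frontend can assign the string directly as an image source before drawing anything on top of it.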

My (maybe not so good) solution direction

After quite some exploration of what might and might not work, the idea is to leverage canvas as a middle ground for rendering the SVG (as a background image) as well as using…


Intention and Background

There are times when we want to search for similar images, but the search algorithm does not know where the “attention” should be within the crowded information of the input image.

So I am thinking of building a web interface that lets the user select a portion of the image as the search input.

What it does

The user loads an image (it appears on the left), and superpixels are then generated. The user can click on those superpixels to “select” them, and the selected portion is displayed on the right.

The image “Stylish in Marietta” by tvdflickr is licensed under CC BY 2.0

Upon clicking the search button, the “selected portion” is used to search for similar images, and a sliding drawer (from the left) is displayed.


Objective

I often need to slice a portion of data out of a multi-dimensional vector/tensor. NumPy arrays and PyTorch tensors make slicing very easy, and with very similar syntax. In some scenarios I need to work with a plain list instead, and here is one implementation that can do that.

Example 1

Given a simple 2D list of numbers, I want to slice the input into 3 list-like elements.

from:
[[ 1, 2, 3, 4, 5],
[11,12,13,14,15],
[21,22,23,24,25]]
into:
[[ 1, 2, 3], | [4, | [7,
[11,12,13], | 5, | 8,
[21,22,23]] | 6] | 9]
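For plain nested lists, one possible approach is a list comprehension over the rows. This is a sketch under the assumption that the three pieces are the first three columns, the fourth column, and the fifth column of the input grid; the helper name `slice_2d` is hypothetical.

```python
data = [
    [1, 2, 3, 4, 5],
    [11, 12, 13, 14, 15],
    [21, 22, 23, 24, 25],
]


def slice_2d(rows, row_sel, col_sel):
    """Mimic array[row_sel, col_sel] for a list of lists.

    row_sel must be a slice; col_sel may be a slice (keeps each row as a
    list) or an int (flattens the selected column into a single list).
    """
    return [row[col_sel] for row in rows[row_sel]]


first_three = slice_2d(data, slice(None), slice(0, 3))  # [[1, 2, 3], [11, 12, 13], [21, 22, 23]]
fourth = slice_2d(data, slice(None), 3)                 # [4, 14, 24]
fifth = slice_2d(data, slice(None), 4)                  # [5, 15, 25]
```

Passing `slice(None)` plays the role of the bare `:` in array indexing, so the call shape stays close to the NumPy and PyTorch syntax.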

Numpy array

The syntax is…


Intention

There have been cases where my dataset is not strictly numerical and does not necessarily fit into a tensor, so I have been looking for ways to manage my data loading beyond passing the input to a PyTorch DataLoader object and letting it sample the batches for me automatically. I have done this multiple times, so I would like to study it a bit deeper and share it here as a record for my future reference.

Main Reference

PyTorch official reference:

Main Classes / function(s)

Dataset (and their subclasses)

This is not always necessary, especially since our datasets are normally in the form of lists, NumPy arrays, and tensor-like…
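To make the idea concrete without depending on PyTorch, here is a plain-Python illustration of the map-style protocol (`__len__` plus `__getitem__`) and of manual batching over non-tensor records. It is not PyTorch's implementation, and the names `RecordDataset` and `iter_batches` are hypothetical.

```python
import random


class RecordDataset:
    """Map-style container: supports len() and integer indexing."""

    def __init__(self, records):
        self.records = list(records)

    def __len__(self):
        return len(self.records)

    def __getitem__(self, index):
        return self.records[index]


def iter_batches(dataset, batch_size, shuffle=False, seed=None):
    """Yield lists of samples, leaving each sample in its original form."""
    indices = list(range(len(dataset)))
    if shuffle:
        # A seeded Random instance keeps the shuffle reproducible
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [dataset[i] for i in indices[start:start + batch_size]]
```

Because each batch is just a list of the original records, nothing here requires the data to be numeric or tensor-shaped, which is the situation described above.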


When running code in Colab, there are occasions when I need to debug code that was not developed by me but comes from installed packages, and it is impossible to alter the code inside.

When the code is written by myself, it is easy to add a line:

from pdb import set_trace; set_trace()

So when execution reaches this line, it triggers the debugger:

Problem

Note that when debugging, hitting “n” (next) does not lead us to the next line in the code (line 5); one needs to experiment with some “n” or “s” (step) commands to get to that line:


I admire the effort of this article; if you want to embed an interactive graph/plot, follow the approach it shares.

I found at least 2 providers that help host your Plotly figure in the cloud so that you can embed it in your website, but for me an image can sometimes serve the purpose, if the resolution is high enough.

The simplest way to get an image from a Plotly figure is the download-as-image button it provides.

But sometimes, when the information is densely packed, I want something with a slightly higher resolution.

And the good…


This is a quick summary of using the Hugging Face Transformers pipeline and the problems I faced.

Pipeline is a very good idea for streamlining the operations one needs to handle during NLP processing with the Transformers library, including but not limited to:

  1. tokenize the input string
  2. map tokens to IDs (integers)
  3. pass the mapped IDs as a tensor to the model

The old way before pipeline:

# Load pretrained model/tokenizer
import torch
from transformers import DistilBertModel, DistilBertTokenizer
model_class, tokenizer_class, pretrained_weights = (DistilBertModel, DistilBertTokenizer, 'distilbert-base-uncased')
tokenizer = tokenizer_class.from_pretrained(pretrained_weights)
model = model_class.from_pretrained(pretrained_weights)
# Tokenize the text, map the tokens to IDs, and add a batch dimension
input_ids = torch.tensor([tokenizer.encode("this is a test")])
with torch.no_grad(): …


Background

I have a Python project that runs multiple web services on Flask. I used to debug very inefficiently by adding pdb set_trace lines or print messages to trace execution, but with Visual Studio Code as my main IDE (mainly for JavaScript), I would like to leverage it to make my Python programming more efficient.

Challenges

I use Windows, but I prefer Linux when I develop, so I use WSL (Windows Subsystem for Linux).

I use Conda to manage my Python packages, and the environments all live in the file system inside WSL.

When using Visual Studio Code, it can detect the WSL environment for…


This is a browser extension that generates a QR code for the selected text or the current page URL.

Intention and Background

This is a project in which I rethink the OCR solution in:

The problem with the OCR solution was that its accuracy and speed were not satisfactory. Considering the same problem of copying text from a PC and using it on a mobile phone, one way is to use a middleman like Signal, WhatsApp, or email, but my take is to do it with the camera (OCR in the previous solution, QR code scanning in this one).

What it does

It displays the QR code of the current URL or selected…

Stephen Cow Chau
