I discovered this when an independent security scan emailed me to say that my cloud VM had port 27017 open.
I had a MongoDB Docker image running on that VM with the simple 27017:27017 port mapping. I believed I had not opened the port on the VM firewall, but when I tried telnet I confirmed that I could reach the port remotely, and that is something I need to fix.
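The telnet check can also be scripted. Below is a minimal sketch in Python, assuming only the standard library; the helper name `port_is_open` and the address in the usage comment are placeholders of mine, not part of the original setup.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the same TCP handshake that telnet would.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Hypothetical usage from a machine outside the VM (placeholder address):
# port_is_open("203.0.113.10", 27017)
```

If this returns True from a remote machine, the port is reachable regardless of what the VM firewall configuration appears to say.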
All credit goes to this git comment:
In short, Docker adds its own firewall rules, and these expose every published port.
The solution below is copied from the…
I have been working on a website that needs to…
After quite some exploration of what might and might not work, the idea is to use canvas as a middle ground for rendering the SVG (as a background image) as well as using…
There are times when we want to search for similar images, but the search algorithm cannot know where the “attention” should be amid the crowded information in the input image.
So I am thinking of building a web interface that lets the user select a portion of the image as the search input.
The user loads an image (which appears on the left) and generates superpixels. The user can then click on superpixels to “select” them, and the selected portion is displayed on the right.
Upon clicking the search button, the “selected portion” is used to search for similar images, which are displayed in a sliding drawer (from the left).
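To make the “selected portion” concrete, here is a minimal sketch in plain Python. It assumes the superpixel step yields a label map with one integer label per pixel; the function name `select_superpixels` and the tiny example data are hypothetical, and a real interface would more likely do this on image arrays in the browser or with an imaging library.

```python
def select_superpixels(image, labels, selected, background=0):
    """Keep pixels whose superpixel label was clicked; blank out the rest."""
    return [
        [pixel if label in selected else background
         for pixel, label in zip(image_row, label_row)]
        for image_row, label_row in zip(image, labels)
    ]


# Hypothetical 2x3 grayscale image and its superpixel label map.
image = [[10, 20, 30],
         [40, 50, 60]]
labels = [[0, 0, 1],
          [2, 1, 1]]

# The user clicked superpixel 1, so only its pixels survive.
selected_portion = select_superpixels(image, labels, selected={1})
```

The resulting `selected_portion` is what would be shown on the right and passed on as the search input.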
I often need to slice a portion of data out of a multi-dimensional vector/tensor. NumPy arrays and PyTorch tensors make slicing very easy, and with very similar syntax. In some scenarios I need to work with a plain list instead, and here is one way it can be done.
Given a simple 2D list of numbers, I want to slice the input into 3 list-like elements:
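As a rough illustration of the general idea (a sketch of mine, not necessarily the implementation this post goes on to describe), a list of lists can be sliced along both axes with a list comprehension. The helper name `slice_2d` and the data are made up for the example.

```python
def slice_2d(rows, row_slice, col_slice):
    """Slice a list of lists much like array[row_slice, col_slice] in NumPy."""
    return [row[col_slice] for row in rows[row_slice]]


# Illustrative data only.
data = [
    [1, 2, 3, 4, 5],
    [11, 12, 13, 14, 15],
    [21, 22, 23, 24, 25],
]

# All rows, first three columns.
left = slice_2d(data, slice(None), slice(0, 3))

# A single column, flattened into a 1D list.
fourth_column = [row[3] for row in data]
```

The `slice` objects play the role of the `:` and `0:3` that NumPy would accept directly inside the brackets.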
[[ 1, 2, 3, 4, 5],
[[ 1, 2, 3], | [4, | [7,
[11,12,13], | 5, | 8,
[21,22,23]] | 6] | 9]
The syntax is…
There have been cases where my dataset is not strictly numerical and does not necessarily fit into a tensor, so I have been looking for a way to manage data loading beyond passing the input to a PyTorch DataLoader object and letting it sample the batches for me automatically. I have done this several times now, so I would like to study it a bit deeper and record it here for my future reference.
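For context, the batching that a loader performs can be sketched in plain Python. This is a simplified stand-in under my own assumptions, not PyTorch's implementation: the function name `iter_batches`, its parameters, and the sample records are all hypothetical.

```python
import random


def iter_batches(samples, batch_size, shuffle=True, seed=None):
    """Yield lists of samples, each at most batch_size long."""
    indices = list(range(len(samples)))
    if shuffle:
        # A dedicated Random instance keeps the shuffle reproducible per seed.
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        batch_indices = indices[start:start + batch_size]
        # Samples stay as ordinary Python objects; nothing is stacked into tensors.
        yield [samples[i] for i in batch_indices]


# Hypothetical non-numerical records of varying length.
records = [{"text": "a"}, {"text": "bb"}, {"text": "ccc"}, {"text": "d"}, {"text": "ee"}]
batches = list(iter_batches(records, batch_size=2, shuffle=False))
```

Because each batch is just a list, the records can be dictionaries, strings, or anything else that would not stack cleanly into a tensor.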
PyTorch official reference:
This is not always necessary, especially since our datasets are normally in the form of lists, NumPy arrays, and tensor-like…
When running code in Colab, there are occasions when I need to debug code that was not developed by me but comes from installed packages, and it is impossible to alter the code inside.
When the code is written by myself, it’s easy to add this line:
from pdb import set_trace; set_trace()
So when the code runs to this line, it triggers the debugger:
Note that when debugging, hitting “n” (next) does not lead us to the next line in the code (line 5); one needs to experiment with a few “n” or “s” (step) commands to get to that line:
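For code that cannot be edited, one option from the standard library is to start the function under the debugger from the outside with `pdb.runcall`. The sketch below is my own illustration and may differ from the approach this post goes on to describe; the wrapper `debug_call` and its `commands` parameter are hypothetical conveniences, and `json.dumps` merely stands in for a function from an installed package.

```python
import io
import json
import pdb


def debug_call(func, *args, commands=None, **kwargs):
    """Run func under pdb without touching its source.

    commands (hypothetical helper argument): pdb commands, one per line,
    fed to the debugger instead of keyboard input.
    """
    if commands is None:
        # Interactive: pdb stops at the first line of func and waits for input.
        return pdb.runcall(func, *args, **kwargs)
    debugger = pdb.Pdb(
        stdin=io.StringIO(commands),
        stdout=io.StringIO(),
        nosigint=True,
        readrc=False,
    )
    return debugger.runcall(func, *args, **kwargs)


# Scripted run: stop at the first line of json.dumps, then continue.
result = debug_call(json.dumps, {"a": 1}, commands="continue\n")

# Interactive run (for example in a Colab cell):
# debug_call(json.dumps, {"a": 1})
```

In the interactive form the usual “n” and “s” commands apply from the first line of the called function.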
I admire the effort behind this article; if you want to embed an interactive graph/plot, follow the approach it shares.
I found at least 2 providers that can host your Plotly charts in the cloud so you can embed them in your website, but for me an image can sometimes serve the purpose, if the resolution is high enough.
The simplest way to get an image from a Plotly chart is the download-as-image button it provides.
But sometimes, when the information is densely packed, I want something with a slightly higher resolution.
And the good…
This is a quick summary of using the Hugging Face Transformers pipeline and the problems I faced.
Pipeline is a very good idea for streamlining some of the operations one needs to handle during NLP processing with their transformers library, including but not limited to:
The old way before pipeline:
# Load pretrained model/tokenizer
import torch
from transformers import DistilBertModel, DistilBertTokenizer
model_class, tokenizer_class, pretrained_weights = (DistilBertModel, DistilBertTokenizer, 'distilbert-base-uncased')
tokenizer = tokenizer_class.from_pretrained(pretrained_weights)
model = model_class.from_pretrained(pretrained_weights)
# Encode the text into token ids and wrap it as a batch of one
input_ids = torch.tensor([tokenizer.encode("this is a test")])
with torch.no_grad(): …
I use Windows, but I prefer Linux for development, so I use WSL (Windows Subsystem for Linux).
I use Conda to manage my Python packages, and the environments all live in the file system inside WSL.
When using Visual Studio Code, it can detect the WSL environment for…
This is a browser extension that generates a QR code for the selected text or the current page URL.
This is a project in which I rethink the OCR solution in:
The problem with the OCR solution was that its accuracy and speed were not satisfactory. Considering the same problem of copying text from a PC to use on a mobile phone, one way is to go through a middleman such as Signal, WhatsApp, or email… But my take is to do it with the camera (OCR in the previous solution, QR code scanning in this one).
It displays the QR code of the current URL or selected…