Exploring Pandas Read Excel File Error

Problem Description

Using pandas' read_excel method to read an Excel file with 160,000 rows raises an AssertionError:

"/Users/XXX/excel_test/venv/lib/python3.7/site-packages/xlrd/xlsx.py", line 637, in do_row
    assert 0 <= self.rowx < X12_MAX_ROWS
AssertionError

Underlying Principle

Excel files come in two default formats. Before Excel 2007, files used the .xls format, a binary format supporting up to 65,536 rows (16,384 rows before Excel 97) and 256 columns. Starting with Excel 2007, the XML-based .xlsx format was adopted, supporting up to 1,048,576 rows and 16,384 columns. Note that when a .xlsx file is converted to .xls, any data beyond row 65,536 or column 256 is lost. ...
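The sheet-size limits quoted above can be written down as a small check. This is only a sketch: the function name fits_in_format and the file names are hypothetical, and the limits are the ones stated in the excerpt, not values read from xlrd or pandas.

```python
from pathlib import Path

# Worksheet size limits as stated in the text above: (max rows, max columns).
SHEET_LIMITS = {
    ".xls": (65_536, 256),
    ".xlsx": (1_048_576, 16_384),
}


def fits_in_format(filename: str, n_rows: int, n_cols: int) -> bool:
    """Return True if an n_rows x n_cols table fits in one sheet of the file's format."""
    suffix = Path(filename).suffix.lower()
    if suffix not in SHEET_LIMITS:
        raise ValueError(f"unsupported Excel extension: {suffix!r}")
    max_rows, max_cols = SHEET_LIMITS[suffix]
    return n_rows <= max_rows and n_cols <= max_cols


if __name__ == "__main__":
    # Hypothetical file names; 160,000 rows is the size mentioned in the excerpt.
    print(fits_in_format("report.xlsx", 160_000, 10))
    print(fits_in_format("report.xls", 160_000, 10))
```

By these numbers, 160,000 rows is within the .xlsx row limit but well beyond the .xls one.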

August 22, 2019 · 3 min · Zhiya

Using a Proxy in Docker Build: A Solution Approach

Problem Description

When building an image with docker build, the build sometimes needs to reach the network through a proxy. The following Dockerfile simulates this scenario:

FROM golang:1.12
RUN curl www.google.com --max-time 3

In a typical network environment in China, curl www.google.com does not return successfully. The --max-time option keeps the curl command from running too long.

Configuring the http_proxy Variable

First, set the http_proxy and https_proxy environment variables so that network commands (represented here by curl) can reach www.google.com through the proxy server named in those variables. ...
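The excerpt is truncated before it shows how the variables are set, so the following is only one commonly used option, not necessarily the author's: passing http_proxy and https_proxy as build arguments, which Docker predefines for this purpose. The helper name, the proxy address http://127.0.0.1:8118, and the image tag are all hypothetical.

```python
import shlex
from typing import List


def docker_build_command(proxy_url: str, tag: str, context: str = ".") -> List[str]:
    """Build a `docker build` argument list that passes the proxy as build args."""
    args = ["docker", "build"]
    # http_proxy and https_proxy are predefined build args in Docker,
    # so the Dockerfile does not need matching ARG instructions.
    for name in ("http_proxy", "https_proxy"):
        args += ["--build-arg", f"{name}={proxy_url}"]
    args += ["-t", tag, context]
    return args


if __name__ == "__main__":
    # Assumed proxy address and image tag, for illustration only.
    command = docker_build_command("http://127.0.0.1:8118", "proxy-test")
    print(shlex.join(command))
```

The list form can be handed to subprocess.run unchanged, which avoids shell quoting issues if the proxy URL contains credentials.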

April 18, 2019 · 3 min · Zhiya

Understanding the Behavior of the count Function in PostgreSQL

The use of the count function has always been a topic of debate, especially in MySQL. As PostgreSQL gains popularity, does it have similar issues? Let’s explore the behavior of the count function in PostgreSQL through practical experiments.

Building a Test Database

Create a test database and a test table. The test table includes three fields: an auto-increment ID, a creation time, and content. The auto-increment ID field is the primary key. ...

April 16, 2019 · 7 min · Zhiya

JWT Pitfalls Guide: Solving the nbf Verification Failure Issue

Symptom

A freshly issued JWT is rejected when it is used in the very next request, which fails with a 422 error:

{ "msg": "The token is not yet valid (nbf)" }

If you wait a few seconds and send the request again (for example, with Replay XHR in Chrome Developer Tools), it succeeds.

Principle of the nbf Field

The error message above mentions nbf, a claim defined in the JWT specification. It stands for Not Before: the token must not be accepted before this time, which is generally set to the issuance time. This suggests a hypothesis: in a multi-server environment, if the servers' clocks are not synchronized, a token issued by one server may fail verification on another because of the nbf claim. The JWT specification anticipates this problem, and its description of nbf specifically mentions allowing a small leeway to account for it. ...
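The clock-skew hypothesis and the effect of a leeway can be illustrated with plain arithmetic. This is a minimal sketch of the idea, not the code of any JWT library; the function name nbf_is_satisfied and the timestamps are made up for the example.

```python
import time
from typing import Optional


def nbf_is_satisfied(nbf: float, now: Optional[float] = None, leeway: float = 0.0) -> bool:
    """Return True if a token with this nbf may be accepted at time `now`.

    `leeway` is the number of seconds of clock skew the verifier tolerates.
    """
    if now is None:
        now = time.time()
    return now + leeway >= nbf


if __name__ == "__main__":
    issued_at = 1_000.0              # issuing server's clock; nbf is set to this
    verifier_now = issued_at - 3.0   # verifying server's clock runs 3 s behind

    print(nbf_is_satisfied(issued_at, now=verifier_now))             # strict check
    print(nbf_is_satisfied(issued_at, now=verifier_now, leeway=10))  # 10 s tolerance
```

With a strict check the verifier sees the token as not yet valid. Tolerating a few seconds of skew lets the same token through, and waiting a few seconds before retrying has the same effect, which matches the symptom described above.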

March 26, 2019 · 4 min · Zhiya

Troubleshooting Large File Upload Errors in an nginx + ingress + gunicorn Environment

In a Python web application deployed on Kubernetes and served by Gunicorn, a series of errors occurred when uploading large files. Here I document how they were solved.

File Upload Process

File upload flow:

1. The uploaded file first reaches the host machine where Kubernetes is running.
2. Nginx on the host machine proxies it to the Ingress Controller in the Kubernetes cluster, which is also implemented with Nginx.
3. Nginx in the Ingress Controller proxies it to Gunicorn.
4. Gunicorn starts several workers to handle requests, so it hands the request to one of them.
5. The worker is the final Python web app.

Solving Error 413

The first error encountered was 413 Request Entity Too Large. During the upload, the connection was interrupted (almost always at the same upload percentage) and the request returned 413. My first suspicion was that Nginx limits the request body size. The Nginx documentation confirms that the client_max_body_size directive controls the maximum request body size, with a default of 1m (1 MB). ...
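To see why an upload trips this limit, the size check can be modeled in a few lines. This is a simplified sketch of the rule described in the Nginx documentation, not Nginx's own code; the function names are hypothetical, and the parser handles only plain bytes and the k and m suffixes.

```python
import re

# Multipliers for the size suffixes handled by this sketch.
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2}


def parse_nginx_size(value: str) -> int:
    """Convert a size such as '1m' or '512k' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([km]?)", value.strip().lower())
    if match is None:
        raise ValueError(f"unsupported size value: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


def body_too_large(content_length: int, client_max_body_size: str = "1m") -> bool:
    """Return True if a request body of this size would be rejected with 413."""
    limit = parse_nginx_size(client_max_body_size)
    # A limit of 0 disables the body size check.
    return limit != 0 and content_length > limit


if __name__ == "__main__":
    upload_size = 50 * 1024 ** 2  # a hypothetical 50 MB upload
    print(body_too_large(upload_size))           # checked against the default 1m
    print(body_too_large(upload_size, "100m"))   # checked against a raised limit
```

In the flow described above the request passes through two Nginx instances, so the limit has to be large enough on both the host's Nginx and the Ingress Controller.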

February 28, 2019 · 4 min · Zhiya