mysql - How to decide the maximal characters count for VARCHAR-like type columns in databases if specified value is actual only

In contrast to CHAR, VARCHAR values are stored as a 1-byte or 2-byte length prefix plus data. The length prefix indicates the number of bytes in the value. A column uses one length byte if values require no more than 255 bytes, two length bytes if values may require more than 255 bytes.

MySQL official documentation

Let us consider this problem using an e-commerce store as an example. Here is how the product (item) entity could be defined (you can most likely follow this code even if you have not learned TypeScript):

import { FIXED_CHARACTERS_COUNT_IN_UNIVERSAL_UNIQUE_ID__VERSION_4 } from "fundamental-constants";


type Product = {
  readonly ID: Product.ID;
  label: string;
  price__dollars__withoutTaxes: number;
};


namespace Product {

  export type ID = string;
  export namespace ID {
    export const TYPE: StringConstructor = String;
    export const REQUIRED: boolean = true;
    export const FIXED_CHARACTERS_COUNT: number = FIXED_CHARACTERS_COUNT_IN_UNIVERSAL_UNIQUE_ID__VERSION_4;
  }

  export namespace Label {
    export const TYPE: StringConstructor = String;
    export const REQUIRED: boolean = true;
    export const MINIMAL_CHARACTERS_COUNT: number = 2;
    export const MAXIMAL_CHARACTERS_COUNT: number = 127;
  }

  export namespace Price__Dollars__WithoutTaxes {
    export const TYPE: NumberConstructor = Number;
    export const REQUIRED: boolean = true;
    export const MINIMAL_VALUE: number = 0;
  }

}

Input validation on the frontend, request data validation on the backend, and the database definition must all obey the business rules above. In particular, the product label must contain from 2 to 127 characters.

Assume that the values above are never hardcoded separately on the frontend and the backend; instead, they are referenced:

<!-- BAD: the maximal characters count has been HARDCODED -->
<label for="PRODUCT_LABEL--INPUT">Please input 2-127 characters.</label>
<input type="text" maxlength="127" id="PRODUCT_LABEL--INPUT" />

<!-- GOOD: the maximal characters count is referenced (whatever the template engine is) -->
<label for="PRODUCT_LABEL--INPUT">Please input {{ Product.Label.MINIMAL_CHARACTERS_COUNT }}-{{ Product.Label.MAXIMAL_CHARACTERS_COUNT }} characters.</label>
<input type="text" maxlength="{{ Product.Label.MAXIMAL_CHARACTERS_COUNT }}" id="PRODUCT_LABEL--INPUT" />

When defining the database (no matter how exactly: via a raw SQL query, a GUI tool, or an ORM), we will also give the label column of the Products table a VARCHAR-like type with a maximal length of 127 characters (again, by referencing Product.Label.MAXIMAL_CHARACTERS_COUNT instead of typing 127 directly).
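As a rough sketch of what "referencing the constant" could look like for a raw SQL definition, the column length can be interpolated into the DDL string. Everything here is an assumption made for illustration: the `ProductLabel` object stands in for `Product.Label`, the `buildProductsTableDDL` helper and the column names are hypothetical, `CHAR(36)` assumes the hyphenated text form of a UUID, and `utf8mb4` is just one possible character set.

```typescript
// Hypothetical stand-in for Product.Label from the question.
const ProductLabel = {
  MINIMAL_CHARACTERS_COUNT: 2,
  MAXIMAL_CHARACTERS_COUNT: 127,
} as const;

// Builds the table definition so that the VARCHAR length comes from the
// shared constant instead of being typed a second time.
// Column names, CHAR(36) and utf8mb4 are assumptions, not from the question.
function buildProductsTableDDL(maximalLabelLength: number): string {
  return [
    "CREATE TABLE Products (",
    "  ID CHAR(36) NOT NULL PRIMARY KEY,",
    `  Label VARCHAR(${maximalLabelLength}) NOT NULL,`,
    "  PriceDollarsWithoutTaxes DECIMAL(10, 2) NOT NULL",
    ") CHARACTER SET utf8mb4;",
  ].join("\n");
}

const productsTableDDL = buildProductsTableDDL(ProductLabel.MAXIMAL_CHARACTERS_COUNT);
console.log(productsTableDDL);
```

An ORM would normally take the same number as a column option, so the idea carries over even when no SQL is written by hand.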

Then assume that the seller has entered a product label of almost 127 characters, some of which are 2-byte ones. Neither the frontend validation nor the backend validation of the request data rejects the value. But once the server application tries to save the added (or updated) product to the table, we get an exception saying that the label's value exceeds the maximal length!

Question: which value must be set in Product.Label.MAXIMAL_CHARACTERS_COUNT? (Let me repeat that this value is referenced from both the frontend and the backend.)

Asked Nov 16, 2024 at 2:29 by Takeshi Tokugawa YD; edited Nov 17, 2024 at 6:57 by Progman.

2 Answers

The length argument of a VARCHAR(L) column specifies how many characters can be stored. It does not include the additional bytes MySQL needs to store the value in the table. The quoted documentation only describes how many additional bytes are needed to store a value in a VARCHAR column. See the following example:

mysql> CREATE TABLE Dummy (Label VARCHAR(10));
Query OK, 0 rows affected (0.02 sec)

mysql> INSERT INTO Dummy(Label) VALUES('12345');
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO Dummy(Label) VALUES('123456789');
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO Dummy(Label) VALUES('1234567890');
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO Dummy(Label) VALUES('12345678901');
ERROR 1406 (22001): Data too long for column 'Label' at row 1

As you can see, the string 123456789 (length nine) can be saved in the VARCHAR(10) column, since 9<=10. It does, however, require 1 additional byte to save the data.

Saving the string 1234567890 (length ten) in the VARCHAR(10) column works as well, since 10<=10. Again, it needs 1 additional byte for the length of the string.

The value 12345678901 cannot be saved, since the string has a length of eleven and is too long for a column of type VARCHAR(10).

So if you want to save only labels with a maximum length of 127, use VARCHAR(127). A user will be able to save strings of up to 127 characters, but no longer ones.

Keep in mind that the limit is counted in characters, not bytes. This means that the value äöüäöüäöü (nine umlauts) can be saved in a VARCHAR(10) column, since 9<=10, even though 18+1 bytes are needed to store the data in the table. See the following SELECT statement:

mysql> SELECT Label, LENGTH(Label) FROM Dummy;
+--------------------+---------------+
| Label              | LENGTH(Label) |
+--------------------+---------------+
| 12345              |             5 |
| 123456789          |             9 |
| 1234567890         |            10 |
| äöüäöü             |            12 |
| äöüäöüöäü          |            18 |
+--------------------+---------------+
5 rows in set (0.00 sec)

mysql> EXPLAIN Dummy;
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| Label | varchar(10) | YES  |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+
1 row in set (0.00 sec)
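The same difference between characters and bytes can be reproduced on the application side. The sketch below is only an illustration: `measureText` is a hypothetical helper, and it assumes the text is encoded as UTF-8, which is what makes each umlaut occupy two bytes in the output above. It counts Unicode code points for the character figure and derives the byte figure from each code point's UTF-8 width.

```typescript
// Measures a string in the two units discussed above: characters (Unicode
// code points) and the bytes its UTF-8 encoding would occupy.
function measureText(text: string): { characters: number; utf8Bytes: number } {
  let characters = 0;
  let utf8Bytes = 0;
  for (let index = 0; index < text.length; index++) {
    let codePoint = text.charCodeAt(index);
    const next = index + 1 < text.length ? text.charCodeAt(index + 1) : 0;
    // A high surrogate followed by a low surrogate encodes one code point.
    if (codePoint >= 0xd800 && codePoint <= 0xdbff && next >= 0xdc00 && next <= 0xdfff) {
      codePoint = 0x10000 + ((codePoint - 0xd800) << 10) + (next - 0xdc00);
      index++;
    }
    characters++;
    // UTF-8 width of one code point: 1, 2, 3 or 4 bytes.
    utf8Bytes += codePoint < 0x80 ? 1 : codePoint < 0x800 ? 2 : codePoint < 0x10000 ? 3 : 4;
  }
  return { characters, utf8Bytes };
}

const asciiLabel = measureText("1234567890");
// Nine umlauts: äöüäöüäöü
const umlautLabel = measureText("\u00e4\u00f6\u00fc\u00e4\u00f6\u00fc\u00e4\u00f6\u00fc");

console.log(asciiLabel);  // { characters: 10, utf8Bytes: 10 }
console.log(umlautLabel); // { characters: 9, utf8Bytes: 18 }
```

The `characters` figure is the one to compare against the VARCHAR length; the `utf8Bytes` figure corresponds to what LENGTH() reports in the SELECT above.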

Your quote from the MySQL documentation at the top seems to show that you're conflating two concerns. The "2-byte length prefix" the documentation refers to is just a number stored at the beginning of every varchar column value which represents the length of the string contained within that column. For your purposes, it's not something you really need to be thinking about.

At least from my understanding of your question, the 2-byte values that you seem to be concerned about are actually Unicode characters within the text that require multiple bytes to be represented (and it is worth noting that plenty of Unicode characters require more than two bytes as well).

As a general rule of thumb, you should think about all of your character limits in terms of actual Unicode characters rather than bytes. For example, if I have 10 Unicode characters that each require 4 bytes to store, I should be at 10/127 of your character limit, not 40/127.
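A shared validator that follows this rule of thumb might look roughly like the sketch below. It is an assumption-laden illustration, not the asker's code: `LABEL_LIMITS` stands in for `Product.Label`, and `countCodePoints` and `isValidProductLabel` are hypothetical helpers. It treats one Unicode code point as one character, so a sequence such as a flag emoji still counts as two.

```typescript
// Hypothetical stand-in for the Product.Label limits from the question.
const LABEL_LIMITS = {
  MINIMAL_CHARACTERS_COUNT: 2,
  MAXIMAL_CHARACTERS_COUNT: 127,
};

// Counts Unicode code points by collapsing every UTF-16 surrogate pair
// into a single placeholder character before taking the length.
function countCodePoints(text: string): number {
  return text.replace(/[\uD800-\uDBFF][\uDC00-\uDFFF]/g, "_").length;
}

// Validates the label in character units, never in bytes or UTF-16 code units.
function isValidProductLabel(label: string): boolean {
  const charactersCount = countCodePoints(label);
  return (
    charactersCount >= LABEL_LIMITS.MINIMAL_CHARACTERS_COUNT &&
    charactersCount <= LABEL_LIMITS.MAXIMAL_CHARACTERS_COUNT
  );
}

// Two emoji are two characters, although they take 8 bytes in UTF-8.
console.log(isValidProductLabel("\uD83D\uDE00\uD83D\uDE00")); // true
```

Because the function has no browser or server dependencies, the same module could be imported by both the frontend and the backend, which keeps the two validations from drifting apart.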

This is how MySQL works, assuming you're on a version > 5, and have your table configured to use UTF-8 (docs):

For definitions of character string columns (CHAR, VARCHAR, and the TEXT types), MySQL interprets length specifications in character units.

However, this is not how maxlength and minlength work in HTML: they count UTF-16 code units (so essentially two bytes per unit), so if you have, for example, a large emoji, they will not count it correctly out of the box:

<p>You can't type any additional characters into this box, as
the flag emojis use 4 UTF-16 code units each:</p>
<input type="text" value="
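To illustrate the UTF-16 code unit counting described above, the short sketch below prints the `length` of a few sample strings, since a JavaScript string's `length` is measured in the same unit. The sample list is made up for this illustration, and the strings are written with escapes so that the code units are visible.

```typescript
// Each entry: a description and a sample string written with UTF-16 escapes.
const samples: Array<[string, string]> = [
  ["ASCII letter", "a"],
  ["umlaut", "\u00e4"],
  ["emoji outside the Basic Multilingual Plane", "\uD83D\uDE00"],
  ["flag built from two regional indicators", "\uD83C\uDDEF\uD83C\uDDF5"],
];

// String.prototype.length reports UTF-16 code units, not characters.
const codeUnitCounts: number[] = samples.map((sample) => sample[1].length);

samples.forEach((sample, index) => {
  console.log(sample[0] + ": " + codeUnitCounts[index]);
});
// ASCII letter: 1
// umlaut: 1
// emoji outside the Basic Multilingual Plane: 2
// flag built from two regional indicators: 4
```

Under this counting, an input limited to four units would already be full after a single flag, so a limit enforced by the attribute alone can be stricter than the character limit of the database column.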
