this post was submitted on 25 Jun 2025
29 points (79.6% liked)

Today I Learned


I tried to enter my name "哎满" and it didn't work. I asked in Libera Chat and they said you can't enter non-ASCII chars in IRC; only a few IRC instances support it, since it could easily be abused.

Erroneous Nickname: 哎满
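For context, "Erroneous Nickname" is the text that goes with IRC numeric 432 (ERR_ERRONEUSNICKNAME). Below is a minimal sketch of the nickname grammar from RFC 2812 (a letter or "special" character first, then letters, digits, specials, or "-"). The helper name `is_valid_nick` and the `max_len` parameter are made up for illustration; real networks set their own length limits and some relax the character rules, so treat this as an approximation rather than what Libera actually runs.

```python
import re

# "special" characters allowed by the RFC 2812 nickname grammar: [ ] \ ` _ ^ { | }
SPECIAL = r"\[\]\\`_^{|}"

# First character: letter or special. Rest: letter, digit, special, or "-".
NICK_RE = re.compile("[A-Za-z" + SPECIAL + "][A-Za-z0-9" + SPECIAL + "-]*")


def is_valid_nick(nick, max_len=9):
    """Return True if nick fits the RFC 2812 grammar.

    max_len is an assumption: the RFC says 9, many networks allow more.
    """
    return len(nick) <= max_len and NICK_RE.fullmatch(nick) is not None


if __name__ == "__main__":
    for candidate in ["aiman", "[bot]", "1abc", "哎满"]:
        print(candidate, is_valid_nick(candidate))
```

Under this grammar "哎满" fails on the very first character, which matches the error above.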

top 5 comments
[–] tal@lemmy.today 12 points 6 days ago* (last edited 6 days ago) (1 children)

Unicode has a lot of "lookalike" characters, so if you're allowed to select characters as a unique identifier to other users, permitting selection of arbitrary Unicode characters opens the possibility to impersonate users.

I believe that there is some system for dealing with this for domain names, as they permit Unicode and being able to uniquely identify domains is important. I don't know if this could be generalized to other Unicode-using applications.
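To make the lookalike problem concrete, here is a rough Python sketch. It compares a Latin "apple" with a version whose first letter is Cyrillic, then flags strings that mix scripts. The helper `scripts_used` is made up for this example, and using the first word of the Unicode character name as a stand-in for the script is only a heuristic; a real implementation would more likely use the confusables data from Unicode's security mechanisms report (UTS #39).

```python
import unicodedata


def scripts_used(text):
    """Rough guess at the scripts in text, from each letter's Unicode name.

    Heuristic only: takes the first word of the name, e.g. "LATIN", "CYRILLIC".
    """
    found = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "UNKNOWN")
            found.add(name.split()[0])
    return found


latin = "apple"
spoof = "\u0430pple"  # first letter is CYRILLIC SMALL LETTER A, not Latin "a"

if __name__ == "__main__":
    print(latin == spoof)  # the two strings are different code point sequences
    print(unicodedata.name(spoof[0]))
    print(scripts_used(latin), scripts_used(spoof))
```

A nickname that mixes scripts like this is a reasonable thing to treat as suspicious, though pure single-script spoofs would slip past such a check.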

[–] Ephera@lemmy.ml 9 points 6 days ago (1 children)

The system for domain names is called Punycode: https://en.wikipedia.org/wiki/Punycode

But it's still combined with domain registrars rejecting names like "αpple.com", which ultimately needs a human to approve names.
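As a small illustration, Python's built-in `idna` codec (which implements the older IDNA 2003 rules) shows what Punycode does to such a name. The helper names `to_ascii` and `to_unicode` are just for this sketch.

```python
def to_ascii(domain):
    """Encode a Unicode domain to its ASCII form (xn-- labels)."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain):
    """Decode an ASCII domain with xn-- labels back to Unicode."""
    return domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    print(to_ascii("münchen.de"))         # xn--mnchen-3ya.de
    print(to_ascii("apple.com"))          # plain ASCII is left unchanged
    print(to_ascii("\u03b1pple.com"))     # Greek alpha: becomes an xn-- label
    print(to_unicode("xn--mnchen-3ya.de"))
```

So the lookalike domain is a different name at the DNS level; the problem is only that browsers may display the Unicode form, which is why the registrar-side checks still matter.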

There could also be a system like here on Lemmy, where there's a separate display name, but it still doesn't really solve the impersonation problem...

[–] Tanoh@lemmy.world 6 points 5 days ago (1 children)

Some TLDs don't allow full Unicode either. Country TLDs usually just add their own special chars, for example .se (Sweden) allows åäö.

The whole thing has a name as well: https://en.wikipedia.org/wiki/IDN_homograph_attack
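A per-TLD allowlist like that could look roughly like the sketch below. The table `EXTRA_BY_TLD` only contains the åäö example from this comment and is not real registry policy, and `label_allowed` is a made-up helper name.

```python
import string
import unicodedata

# Characters every label may use in this sketch.
BASE = set(string.ascii_lowercase + string.digits + "-")

# Hypothetical table: only the example from this thread, not actual registry rules.
EXTRA_BY_TLD = {"se": set("åäö")}


def label_allowed(label, tld):
    """Check a domain label against the base set plus the TLD's extra characters."""
    # Normalize so that "a" + combining ring compares equal to precomposed "å".
    normalized = unicodedata.normalize("NFC", label).lower()
    allowed = BASE | EXTRA_BY_TLD.get(tld, set())
    return all(ch in allowed for ch in normalized)


if __name__ == "__main__":
    print(label_allowed("smörgås", "se"))
    print(label_allowed("\u03b1pple", "se"))
```

The point of keeping the extra set small is that a Greek or Cyrillic lookalike never gets in, because it is simply not on the list.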

[–] tal@lemmy.today 4 points 5 days ago* (last edited 5 days ago)

I'd also add that ASCII has had some similar issues in the past, but those have mostly been ironed out by now via changes to onscreen typefaces.

For example, some old typewriters don't have a "0" key or a "1" key because capital-o and lowercase-l looked similar enough and context was sufficient to let them be used in place of the corresponding number. This trained some people to do that, to the point that various software adapted to permit misuse of one in the place of the other. To this day, I can open up Firefox, and the following webpage will render green text:

<html><font color="#OOFFOO">green text
</font></html>
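As far as I understand it, the snippet above works because browsers follow the HTML spec's rules for parsing a "legacy color value", which replace any non-hex character with 0. Here is a simplified Python sketch of those rules. It skips named colors, the "transparent" keyword, and a couple of edge cases around very long or non-BMP input, and `parse_legacy_color` is a name made up for this example.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def parse_legacy_color(value):
    """Simplified take on HTML legacy color parsing; returns (r, g, b)."""
    s = value.strip()
    if not s:
        raise ValueError("empty color value")
    # "#rgb" shorthand: each digit is scaled from 0-15 to 0-255.
    if len(s) == 4 and s[0] == "#" and all(c in HEX_DIGITS for c in s[1:]):
        return tuple(int(c, 16) * 17 for c in s[1:])
    if s.startswith("#"):
        s = s[1:]
    # The key step: anything that is not a hex digit becomes "0".
    s = "".join(c if c in HEX_DIGITS else "0" for c in s)
    # Pad with zeros until the length is a non-zero multiple of three.
    while len(s) == 0 or len(s) % 3 != 0:
        s += "0"
    n = len(s) // 3
    parts = [s[:n], s[n:2 * n], s[2 * n:]]
    # Keep at most the last 8 digits of each component.
    if n > 8:
        parts = [p[-8:] for p in parts]
        n = 8
    # Drop shared leading zeros while components are longer than 2 digits.
    while n > 2 and all(p[0] == "0" for p in parts):
        parts = [p[1:] for p in parts]
        n -= 1
    # Only the first two digits of each component count.
    return tuple(int(p[:2], 16) for p in parts)


if __name__ == "__main__":
    print(parse_legacy_color("#OOFFOO"))      # (0, 255, 0)
    print(parse_legacy_color("chucknorris"))  # (192, 0, 0)
```

The capital-O characters are not hex digits, so they turn into zeros and the value ends up the same as "#00FF00".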

Some other fixes were made over time, like making capital-i, lowercase-l, and the pipe ("I", "l", and "|") more visually distinct in typefaces where this matters.

In the monospaced font world, "programming" or "coding" fonts, where not confusing the character in question is particularly important, place a premium on keeping characters like this particularly distinctive, even at the cost of trading off some aesthetic appeal or conforming to traditional typography or handwriting-like conventions for letters. You'll get more-distinctive "." and ",", "O" and "0", "l", "I", and "|", "j" and "i", etc.

[–] stoy@lemmy.zip 5 points 6 days ago

A slight correction on IRC terminology:

There are no IRC instances. If you are talking about QuakeNet or EFnet, then you are talking about IRC networks, each of which consists of several servers.