Exploring the Uncommon Features in EmEditor's Text Manipulation Capabilities
November 29, 2007 at 10:36 am #5068
jugaor
Participant
Hi, I tried several versions (5 up to the 7 beta) and found the following ‘bugs’ in both manual and script searches (Spanish texts):
a por (eeFindReplaceOnlyWord)
matches “creería por”, “CAMPAÑA POR”, etc. (i.e., it breaks the words at the accented vowels or “Ñ”/“ñ”)
any accented vowel (eeFindReplaceOnlyWord)
matches “diseñé”, “ENSEÑÓ”, etc. (i.e., it breaks at the “Ñ”/“ñ” the words with final accented vowels)
In manual searches (with an open document), it matches all the accented vowels inside words despite “Search Only Word” (i.e. it matches “cómprale”, “mamá”, “después”, etc.)
(?!es |son )esta(s?)(!|?)
discards the first negative subexpression (i.e., it matches “esta!” / “esta?” / “estas!” / “estas?”), even though I use the ‘eeFindReplaceRegExp Or eeFindReplaceOnlyWord’ options.
If I simplify the expression to
(?!es) esta(!|?)
(?!es)esta(!|?)
or
(?!son) estas(!|?)
(?!son)estas(!|?)
it has the same behavior. However,
(¡|¿)esta(s?)(?! es| son)
excludes the correct ones.
If you need more information, please email me.
TIA.
jugaor
November 29, 2007 at 7:15 pm #5071
Yutaka Emura
Keymaster
As for your first question, previous versions of EmEditor did not check Unicode characters (character code > U+0080) when matching whole words, for the sake of speed. However, I will add a routine that checks some Latin characters (ch >= 0x00c0 && ch <= 0x02b8) in the next beta version. This addition will not cover all Unicode characters, but it should improve “whole word” accuracy in most cases without sacrificing much speed.
I was not sure about your second question, but there are two unnecessary spaces in your regular expression: (?!es |son )esta(s?)(!|?)
One is between “s” and “|”, and the other is between “n” and “)”.
Does removing these spaces not solve your issue?
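The whole-word check described in this reply can be sketched in Python. This is only an illustration of the idea, not EmEditor's actual routine: the function names `is_word_char` and `find_whole_word` and the `extended` flag are hypothetical, and the only detail taken from the post is the range 0x00c0 to 0x02b8.

```python
def is_word_char(ch: str, extended: bool) -> bool:
    """Return True if ch should count as part of a word."""
    if ch.isascii():
        return ch.isalnum() or ch == "_"
    # Assumption: with the extension on, the Latin range quoted in the
    # post (0x00c0..0x02b8) is treated as word characters.
    return extended and 0x00C0 <= ord(ch) <= 0x02B8


def find_whole_word(text: str, term: str, extended: bool) -> list[int]:
    """Return the start offsets where term is bounded by non-word characters."""
    hits = []
    start = text.find(term)
    while start != -1:
        end = start + len(term)
        before_ok = start == 0 or not is_word_char(text[start - 1], extended)
        after_ok = end == len(text) or not is_word_char(text[end], extended)
        if before_ok and after_ok:
            hits.append(start)
        start = text.find(term, start + 1)
    return hits


sample = "creer\u00eda por"  # "creería por" with a precomposed í (U+00ED)
ascii_only = find_whole_word(sample, "a por", extended=False)
with_latin = find_whole_word(sample, "a por", extended=True)
print(ascii_only)  # the accented vowel is treated as a word break
print(with_latin)  # the accented vowel is treated as part of the word
```

With the ASCII-only check, “í” counts as a boundary, so “a por” is reported inside “creería por”, which is the behavior jugaor described. With the extended range, the same search finds nothing there. The sketch assumes precomposed characters; text stored with combining accents would need normalizing first.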
November 30, 2007 at 5:30 am #5074
jugaor
Participant
Hi, thank you very much for your response.
1. In Spanish, the ‘special’ letters are ÁÉÍÓÚÜ, áéíóúü, Ñ, ñ. I presume that this Unicode range covers them :)
2. The spaces are needed, since they’re two whole words:
“esta” = “this” / “estas” = “these”, both feminine.
“es” = “is” (singular, verb to be)
“son” = “are” (plural, verb to be)
The strange thing is that EmEditor works correctly when the same subexpression comes after the word, not before it (i.e. “(¡|¿)esta(s?)(?! es| son)” is correct).
I have been trying to use EmEditor to automatically correct misspelled words in Spanish subtitle files. I wrote some complex VBEE scripts for that, and that is how I found the issues above.
Thanks for your attention,
jugaor
PS: please write to me when the new beta is ready :)
November 30, 2007 at 8:30 pm #5077
Yutaka Emura
Keymaster
(?=pattern) (positive lookahead) and (?!pattern) (negative lookahead) look ahead from the current position in the search; they do not examine the text before it.
For example, the expression “(?=x)x” matches wherever “x” does, and the expression “(?!x)x” never matches.
So it doesn’t make sense to place (?=pattern) or (?!pattern) at the beginning of a search term in order to test what precedes it.
I will release beta 41 today or tomorrow.
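The lookahead behavior explained in this reply can be reproduced with a short Python sketch. It uses Python's `re` module rather than EmEditor's regex engine, so treat it as an illustration of the general rule only. The character class `[!?]` is an assumption standing in for the `(!|?)` group in the original post, and the sample sentences are made up.

```python
import re

# A lookahead tests the text starting at the current position.
always = re.search(r"(?=x)x", "x")   # same result as searching for "x"
never = re.search(r"(?!x)x", "x")    # can never succeed

# Leading lookahead: it is tested against "esta...", which never starts
# with "es " or "son ", so it filters nothing out.
leading = re.compile(r"(?!es |son )estas?[!?]")
leading_hits = [m.group() for m in leading.finditer("es esta! son estas?")]

# Trailing lookahead: it is tested against the text after the word,
# so it can reject "esta" followed by " es".
trailing = re.compile(r"[¡¿]estas?(?! es| son)")
trailing_hits = [
    m.group() for m in trailing.finditer("¿esta es la casa? ¿estas casas?")
]

print(always is not None, never is None)
print(leading_hits)
print(trailing_hits)
```

Both “esta!” and “estas?” are still found by the leading form even though “es” and “son” come right before them, while the trailing form skips “¿esta es”.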
December 1, 2007 at 8:05 am #5080
jugaor
Participant
THANK YOU VERY MUCH! I tried the 41 beta and the ‘special chars’ issue is gone! :D
Also, I saw that I misunderstood the “look ahead” expression :-?
I needed to use the “look behind” one (?<!pattern). Excuse me!
Congratulations on your excellent work!
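For readers who land here with the same problem, the negative lookbehind mentioned above might look like this in Python. It is a sketch under assumptions: Python's `re` module is used instead of EmEditor's engine, `[!?]` stands in for the `(!|?)` group, and the sample text is invented. Python requires fixed-width lookbehinds, so the two alternatives are written as two separate assertions; other engines may accept `(?<!es |son )` directly.

```python
import re

# Reject "esta"/"estas" when it is directly preceded by "es " or "son ".
pattern = re.compile(r"(?<!es )(?<!son )estas?[!?]")

text = "es esta! son estas? mira esta! ¿y estas?"
hits = [m.group() for m in pattern.finditer(text)]
print(hits)
```

Only the last two occurrences are reported. Note that this simple form would also reject a word that merely ends in “es ” (for example “antes esta!”); adding a word boundary such as `(?<!\bes )` is one possible refinement.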
jugaor
- Title: Exploring the Uncommon Features in EmEditor's Text Manipulation Capabilities
- Author: Scott
- Created at : 2024-10-15 16:28:58
- Updated at : 2024-10-17 16:43:24
- Link: https://win-top.techidaily.com/exploring-the-uncommon-features-in-emeditors-text-manipulation-capabilities/
- License: This work is licensed under CC BY-NC-SA 4.0.