[Demo] Searching with regular expressions

Demos, code samples. Only questions related to the existing topics are allowed here.
Sergey Tkachenko
Site Admin
Posts: 17253
Joined: Sat Aug 27, 2005 10:28 am

Post by Sergey Tkachenko »

Searching with regular expressions
Attachment: RVRegEx.zip (177.49 KiB)
This ZIP file contains two demo projects.

1) Search from the cursor position
[Screenshot: TRichView-Regular-Expressions.png]
2) Search and highlight all occurrences
[Screenshot: TRichView-Regular-Expressions-Ex.jpg]
Both demo projects use the same technique. They save the document content to a string that has a 1:1 correspondence with the original content. Regular expressions are searched for in this string using TRegEx. Once found, the matches are marked in the original document.

This demo requires Delphi XE or newer (because TRegEx was introduced in this version of Delphi).
The first demo is very fast even for huge documents.
The second demo may be slow, because TRegEx.Matches is slow when it returns a large number of matches (in a real application, you can use TRegEx.Match followed by a loop of Match.NextMatch calls, and limit the number of results).
To highlight the results, the second demo uses the same mechanism as live spell checking (instead of wavy underlines, it draws semitransparent colored rectangles).
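
For reference, the TRegEx.Match / Match.NextMatch loop mentioned above could look roughly like the sketch below. It is not code from the demo: ListMatches and its MaxResults parameter are hypothetical names used only for illustration, and the sketch only prints the matches instead of marking them in a document. It needs Delphi XE or newer.

```pascal
program LimitedRegExSearch;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.RegularExpressions;

// Hypothetical helper: prints at most MaxResults matches of Pattern in Text
// and returns how many matches were visited.
function ListMatches(const Text, Pattern: string; MaxResults: Integer): Integer;
var
  M: TMatch;
begin
  Result := 0;
  M := TRegEx.Match(Text, Pattern);
  while M.Success and (Result < MaxResults) do
  begin
    // M.Index is a 1-based position in Text; M.Length is counted in Chars
    Writeln(Format('match at %d, length %d: %s', [M.Index, M.Length, M.Value]));
    Inc(Result);
    M := M.NextMatch; // continue searching after the current match
  end;
end;

begin
  // Stops after 3 matches even though the text contains 4 numbers
  ListMatches('a1 b22 c333 d4444', '\d+', 3);
end.
```

Unlike TRegEx.Matches, this loop never builds the complete collection of matches, so the cost stays proportional to the number of results actually requested.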
jgkoehn
Posts: 288
Joined: Thu Feb 20, 2020 9:32 pm

Re: [Demo] Searching with regular expressions

Post by jgkoehn »

Greetings Sergey,
I am working on converting the first regex demo to Lazarus for use in an app. However, as the search goes on, it gets further and further out of sync with the selection. I tried 2, 1, 0, and -1 for the caret position, and it doesn't seem to matter. Thoughts?

Post by jgkoehn »

I found I needed to adjust some code, because in Lazarus I am using PCRE code that was originally written for Delphi, so it is not fully Unicode-aware.
I believe this is the problem, because as more Unicode text is introduced, the matches get more and more out of alignment with RVGetLinearCaretPos.
My guess is that Unicode characters can take up more than one position in M.Index in the following snippet of code.

Code:

function TForm3.Select(m: IMatch): UnicodeString;
var
  StartIndex, EndIndex, ItemNo1, Offs1, ItemNo2, Offs2: Integer;
  RVData1, RVData2: TCustomRVData;
begin

  //StartIndex := M.Index - 1;
  StartIndex := M.Index;
  EndIndex := StartIndex + M.Length;

Post by jgkoehn »

I see that if I use RVGetText, with FText := GetAllText(RichViewEdit1);
it works better for Lazarus and PCRE, but then graphics throw it off. Hmm, I will have to dig more.
[Edit] Well, that only sort of worked. More digging needed.

Post by Sergey Tkachenko »

Do not use the functions from RVGetText/RVGetTextW. Only the functions from RVLinear provide one-to-one correspondence with the document.
What regexp library do you use?

Post by jgkoehn »

Greetings,
Thanks for the information on GetText.
This is the regex library I use, with some modifications made by another user to make it Lazarus-ready:
http://renatomancuso.com/software/dpcre/dpcre.htm
Here is the modified version:
https://github.com/rubiot/ibiblia/blob/ ... re_dll.pas and also the PCRE.pas in the same location.
Also, if you use TRegex.Replace, it has a comment in it that needs to be fixed.
It should not say Result := Input;

Post by jgkoehn »

Any further thoughts on this Sergey?

Post by Sergey Tkachenko »

I converted the first demo to Lazarus:
https://www.trichview.com/support/files/RegExLaz.zip

This conversion is one-to-one, with classes/records mapped to interfaces. I hope I understand it correctly (I assumed that IMatch contains one position defined by GetIndex and GetLength; I do not understand the purpose of its Groups property).

As I understand it, this library can work with Unicode represented as UTF-8 (which is the default string encoding in Lazarus). So Edit1.Text can be passed as is, but the result of RVGetTextRange must be converted using UTF8Encode.
rcoUTF8 must be included in the options.

Post by jgkoehn »

Thank you so much, I will check this out.

Post by jgkoehn »

Could you test your version on the attached test.rvf?
I am searching for "a", and it does fine up until the Greek Unicode text. If it works for you, I definitely have something wrong on my end.
Attachment: test.rvf (1.85 KiB)

Post by jgkoehn »

I emailed the link to the app.

Post by Sergey Tkachenko »

Well, there is a bug in this example: Lazarus regex works with character positions in UTF-8, while the demo works with character positions in UTF-16.
It needs functions that will convert positions in UTF-8 to positions in UTF-16 and vice versa.
I'll make them tomorrow.

Post by Sergey Tkachenko »

I uploaded a new version for Lazarus in the same location:
https://www.trichview.com/support/files/RegExLaz.zip

Now it recalculates indexes from UTF-8 to UTF-16 and back when necessary.
Also, this DLL uses 0-based indexes in strings, while Delphi uses 1-based indexes; this difference was not completely handled in the previous version of this demo.
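
The index recalculation described above can be sketched as follows. This is not the code from RegExLaz.zip: the helper names (SeqInfo, Utf8OffsetToUtf16, Utf16OffsetToUtf8) are made up for illustration, and the sketch assumes well-formed UTF-8 and offsets that fall on character boundaries. All offsets here are 0-based, as in the DLL, so 1 must be added to get a Delphi-style string index.

```pascal
program Utf8Utf16Offsets;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
{$APPTYPE CONSOLE}

// For the UTF-8 sequence that starts with lead byte B, returns its length
// in bytes and the number of UTF-16 code units the same character needs.
procedure SeqInfo(B: Byte; out NBytes, NUnits: Integer);
begin
  NUnits := 1;
  if B < $80 then NBytes := 1
  else if (B and $E0) = $C0 then NBytes := 2
  else if (B and $F0) = $E0 then NBytes := 3
  else if (B and $F8) = $F0 then
  begin
    NBytes := 4;
    NUnits := 2; // outside the BMP: a surrogate pair in UTF-16
  end
  else
  begin
    NBytes := 1; // stray continuation byte: skip it, count nothing
    NUnits := 0;
  end;
end;

// 0-based byte offset in UTF-8 -> 0-based code unit offset in UTF-16
function Utf8OffsetToUtf16(const S: UTF8String; ByteOffset: Integer): Integer;
var
  i, NBytes, NUnits: Integer;
begin
  Result := 0;
  i := 1;
  while (i <= ByteOffset) and (i <= Length(S)) do
  begin
    SeqInfo(Ord(S[i]), NBytes, NUnits);
    Inc(Result, NUnits);
    Inc(i, NBytes);
  end;
end;

// 0-based code unit offset in UTF-16 -> 0-based byte offset in UTF-8
function Utf16OffsetToUtf8(const S: UTF8String; Utf16Offset: Integer): Integer;
var
  i, Done, NBytes, NUnits: Integer;
begin
  i := 1;
  Done := 0;
  while (Done < Utf16Offset) and (i <= Length(S)) do
  begin
    SeqInfo(Ord(S[i]), NBytes, NUnits);
    Inc(Done, NUnits);
    Inc(i, NBytes);
  end;
  Result := i - 1;
end;

var
  W: UnicodeString;
  S: UTF8String;
begin
  W := 'a'#$03B1'b';  // 'a', Greek small alpha, 'b': 3 UTF-16 code units
  S := UTF8Encode(W); // 4 bytes, because alpha takes 2 bytes in UTF-8
  Writeln(Utf8OffsetToUtf16(S, 3)); // 'b' is at byte 3; prints 2
  Writeln(Utf16OffsetToUtf8(S, 2)); // 'b' is at code unit 2; prints 3
end.
```

Each call walks the string from its start, which keeps the sketch short; when converting many match positions in a large document, it would be cheaper to resume from the previously converted position.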

Post by jgkoehn »

Wow, thanks Sergey, this looks complicated. Thanks, sir! It works.