Improved behaviour of LEFT (instruction) #1061

Open
poppichicken opened this issue Dec 2, 2024 · 8 comments
Labels: -done- (Done), beta (Candidate for BETA), enhancement (New feature or request)

@poppichicken

poppichicken commented Dec 2, 2024

hi Marco.

The LEFT instruction and function descriptions seem to have got mixed up, possibly from copy/pasting to the incorrect pages.

On page 570:

image

The description of the function seems to change and become the description for the LEFT (instruction).

And on page 571:

image

The description of the instruction seems to change and become the description for the LEFT (function).
As a result of this, it isn't clear how to use the LEFT (instruction).

@spotlessmind1975 spotlessmind1975 added bug Something isn't working documentation Improvements or additions to documentation labels Dec 2, 2024
@spotlessmind1975 spotlessmind1975 moved this from Needs triage to High priority in Bug fix / Correzione bug Dec 2, 2024
@spotlessmind1975 spotlessmind1975 added this to the future milestone Dec 2, 2024
@spotlessmind1975 spotlessmind1975 added the -fixed- Bug fixed label Dec 2, 2024
@spotlessmind1975
Owner

Hi @poppichicken and thank you for your kind bug report!

First of all, I apologize for the confusion, caused by a typo. Indeed, the LEFT function and the LEFT statement are very similar, but they behave differently. In particular, the LEFT function takes care of extracting a substring from the left part of a given string (so: the variable passed does not undergo variations). On the contrary, the LEFT statement modifies the variable passed, which therefore undergoes variations.

To clarify better, let's see this example:

 a$ = "The Great Gig In The Sky"
 b$ = LEFT( a$, 3 )
 LEFT( a$, 3 ) = "   "

The LEFT function returns "The" when you pass it the string "The Great Gig In The Sky", so this will be the value of b$. The LEFT statement replaces the first three characters with three spaces, resulting in a$ = "    Great Gig In The Sky" (three spaces followed by the original space before "Great").
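As a rough illustration of the difference, here is a minimal Python model of the two forms. It is only a sketch: the names `left_function` and `left_statement` are hypothetical, it is not ugBASIC code, and because Python strings are immutable the "statement" returns the modified string instead of changing the variable in place.

```python
def left_function(s: str, n: int) -> str:
    """Model of the LEFT function: return the first n characters; s is untouched."""
    return s[:max(n, 0)]


def left_statement(s: str, n: int, replacement: str) -> str:
    """Model of the LEFT statement: overwrite the first n characters of s.

    Assumption: the string keeps its length, and no more characters are
    overwritten than the replacement actually provides.
    """
    count = min(max(n, 0), len(s), len(replacement))
    return replacement[:count] + s[count:]


a = "The Great Gig In The Sky"
b = left_function(a, 3)          # a is unchanged by the function form
a = left_statement(a, 3, "   ")  # the first three characters become spaces
print(repr(b))
print(repr(a))
```

The function form leaves its argument alone, while the statement form is the one that changes the contents of a$.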

The correction in the manual has been made, and will be made available with the next USER MANUAL revision.

Thank you again!

@poppichicken
Author

poppichicken commented Dec 4, 2024

Thanks Marco.
That works well.

However, the manual still contains mixed up information.

In the below, we have this sentence:
"Make sure both string$ and substring$ are declared as strings"

image

This doesn't quite seem to be consistent with what the LEFT$ function is doing, unless you mean that x is the substring, and "TEST" is the string in the example below.

image

And in the manual for LEFT (instruction), the paragraph about the second parameter seems to be talking about how the LEFT (function) works, rather than the LEFT (instruction).

image

@spotlessmind1975
Owner

Hi @poppichicken , thank you for the feedback!

However, the manual still contains mixed up information.

Yes, you are absolutely right, that sentence on LEFT (function) really slipped my mind. I have reworded the command description better, which, barring any errors or omissions, will be as follows:

[image: bug1061]

And in the manual for LEFT (instruction), the paragraph about the second parameter seems to be talking about how the LEFT (function) works, rather than the LEFT (instruction).

OK, in this case we are talking about a shortcoming of mine: this functionality of replacing the (left part of a) string with another string took on very interesting nuances in borderline situations, such as those in which the string to be replaced is longer than the indicated length, and similar cases. There is a topic on the forum that delves into the consequences and the opinions of the developers.

Honestly, and if you agree, we should draw up some sort of specification of operation. For example, these could be the edge cases and default behaviours without pragmas (let pos = position, len = LEN(string), exp = LEN(expression) ):

  • IF pos = 0, this is equivalent to not performing any substitutions.
  • IF pos >= len, pos will be replaced by exp;
  • IF pos >= exp, pos will be replaced by exp;
  • IF ( pos < len ) AND ( pos >= exp ), pos will be replaced by exp;
  • IF ( pos < len ) AND ( pos < exp ), pos will be used "as is", replacing only the first characters with the first characters of expression.

I will re-implement it following this guideline and, since it changes the behaviour a bit, I will implement it on the beta branch using this ticket. For this reason, I am turning the ticket into an improvement.

@spotlessmind1975 spotlessmind1975 added enhancement New feature or request and removed bug Something isn't working -fixed- Bug fixed labels Dec 4, 2024
@poppichicken
Author

poppichicken commented Dec 5, 2024

Hmmm... it all seems very complicated.

Assuming len = length of original string, and exp = length of replacement expression...

  • IF pos >= len, pos will be replaced by exp;

If pos >= length of original string, it doesn't make any sense to me to perform any sort of insert or replace, except maybe to append the expression string to the end of the original string?

  • IF pos >= exp, pos will be replaced by exp;

This is confusing, if my assumptions about what LEN and EXP refer to are correct.
I'm not sure that pos needs to be compared to the length of the replacement expression.

For example, if we have:
LEFT("mouse",2)="xx"
it becomes "mxxse", if I have understood the command correctly.

But if we have:
LEFT("mouse",2)="xxx"
it becomes "mxxxe".

Here, the length of the replacement expression is indeed >= pos, but pos should not be changed to 3.
It should remain at 2.

I also find the last 2 cases confusing.

  • IF ( pos < len ) AND ( pos >= exp ), pos will be replaced by exp;
  • IF ( pos < len ) AND ( pos < exp ), pos will be used "as is", replacing only the first characters with the first characters of expression.

Similar to my examples above, it doesn't seem correct to compare pos to exp.

I understand what you are trying to do.
If the length of the replacement expression is 3 characters, you are trying to replace 3 characters in the original string, starting at pos.

Would it make more sense just to have a REPLACE command?
One that would allow you to replace any occurrences of a substring with another expression?
Though I can see problems with that too, and it doesn't really fit what you are trying to achieve.

For example, if we have "abc123abc123", and we want to replace only the first "abc" with "xyz", a traditional REPLACE command would likely replace both "abc" with "xyz", creating "xyz123xyz123".

If your desire is to replace a number of characters (determined by LEN(expression)) at pos, perhaps it's ok to write any number of characters into the original string at pos, overwriting what was there. But maybe limiting the replacement so that it doesn't make the original string longer.

LEFT("musical",3)="elephant"

would then ideally create "mueleph", where the length of the original string is maintained, but as much as possible of the replacement expression overwrites the original string from position pos.

If this is indeed your desired result, perhaps the maximum length of the "writable" part of the original string can be used to grab the leftmost number of characters from the replacement expression, and put those in at pos.

So, in "musical", from position 3, we have a maximum of 5 characters that can be written.
If the replacement expression is <= 5 characters long, no problem.
But if it is > 5 characters long, limit it to 5 characters, and overwrite "musical" from position 3 with the new 5 characters.

Also, if pos=0 OR pos > length of original string, then nothing will be done.
Otherwise pos is somewhere inside the original string, and then the replacement string can be limited to the amount of space left.
(If pos = length of original string, it means only the last character can be overwritten, so only the 1st character of the replacement string is needed).
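The overwrite-at-position idea described in this comment could be modelled roughly as follows. This is a Python sketch of the proposal only, not of ugBASIC's actual behaviour; `overwrite_at` is a hypothetical name and `pos` is assumed to be 1-based, as in the examples above.

```python
def overwrite_at(original: str, pos: int, replacement: str) -> str:
    """Overwrite `original` starting at 1-based `pos` without changing its length."""
    # pos = 0 or pos beyond the end of the string: nothing is done.
    if pos <= 0 or pos > len(original):
        return original
    start = pos - 1                   # convert to a 0-based index
    room = len(original) - start      # writable characters from pos to the end
    chunk = replacement[:room]        # keep only what fits
    return original[:start] + chunk + original[start + len(chunk):]


print(overwrite_at("musical", 3, "elephant"))
print(overwrite_at("mouse", 2, "xx"))
```

With these assumptions, "musical" at position 3 has room for five characters, so only "eleph" is written and the result keeps the original seven-character length.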

I hope that made sense.
:)

@spotlessmind1975
Owner

Hi @poppichicken , thank you for the feedback!

If pos >= length of original string, it doesn't make any sense to me to perform any sort of insert or replace, except maybe to append the expression string to the end of the original string?

Take this example:
image
In this case, we are using the LEFT function to extract two characters from the string. If instead of two characters we had used, for example, twenty characters, the result would have been the extraction of the entire string. I believe this is acceptable and predictable behavior.

Now, we take this example:
image

As an analogy, the result seems valid to me. In a nutshell, if pos >= length it means that the original string must be considered in its entirety, untouched, but the replacement should be done anyway, since the LEFT assignment operation has been requested.

This is confusing, if my assumptions about what LEN and EXP refer to are correct. I'm not sure that pos needs to be compared to the length of the replacement expression.

Take this example:
image
The original string would contain enough characters to replace. However, the string to replace it with has fewer. Since the expression string is what "drives" the replacement process, it is clear that at least the first two characters can be changed, but not the third, because the expression string is not long enough. So if pos >= exp it means pos = exp (provided, of course, that the previous inequality is valid).

LEFT("mouse",2)="xx"
it becomes "mxxse", if I have understood the command correctly.

It should become "xxuse".

LEFT("mouse",2)="xxx"
it becomes "mxxxe".

It should become "xxxse".

Similar to my examples above, it doesn't seem correct to compare pos to exp.

Let me know if the graphic example works for you.

Would it make more sense just to have a REPLACE command?

The BASIC version of REPLACE should be the MID command, which historically deals with replacing an arbitrary length within a string, starting from an arbitrary position. Even then, however, the behaviors at the extremes should be clarified, like LEFT here.

One that would allow you to replace any occurrences of a substring with another expression?

Granted that a similar command might be useful, what should it do if the length of the string to be searched for is different from the length of the string to be replaced? This is a dilemma of all algorithms that deal with strings. I found myself in this impasse precisely because the concept of "search and replace" is a much more complex concept than it seems. BASIC solves the problem by using a combined arrangement of INSTR, LEFT, RIGHT, and the string sum operator (+). If one wanted to implement a REPLACE, it would have to be expressed in terms of these primitives.
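To make the "expressed in terms of these primitives" point concrete, here is a small Python sketch of a replace-first operation built only from a search, two slices, and concatenation, which play the roles of INSTR, LEFT, RIGHT and `+`. The name `replace_first` is hypothetical, and the choice to replace only the first occurrence is an assumption made for this example.

```python
def replace_first(text: str, search: str, replacement: str) -> str:
    """Replace only the first occurrence of `search` in `text`."""
    if not search:
        return text                     # nothing to look for
    pos = text.find(search)             # role of INSTR (0-based here, -1 if absent)
    if pos == -1:
        return text                     # no match: leave the string alone
    head = text[:pos]                   # role of LEFT: everything before the match
    tail = text[pos + len(search):]     # role of RIGHT: everything after the match
    return head + replacement + tail    # string sum


print(replace_first("abc123abc123", "abc", "xyz"))
```

Because the result is rebuilt by concatenation, the search and replacement strings may have different lengths; the resulting string simply grows or shrinks.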

LEFT("musical",3)="elephant"

In this case, we should obtain "eleical".

I hope that made sense.

I hope too! :)

Thank you again!

@poppichicken
Author

poppichicken commented Dec 6, 2024

Ah yes, that all makes sense.
In my explanation, I forgot that you are simply trying to replace n characters at the start of a string.
My apologies once again.

I have been thinking more about this command.

So we have
LEFT(a$,n)=b$

I think these are the rules that need to be considered:
(A) if a$="" then do nothing
(B) if n=0 then do nothing
(C) if b$="" then do nothing
(D) if len(a$)<n then n=len(a$)
(E) if len(a$)<len(b$) then b$=left(b$,len(a$))
(F) if n<>len(b$) then n=len(b$)

However... if rule (F) is valid, then there is no need for n.
Which means the command can simply be:
LEFT(a$)=b$

So only rules A,C,E are needed, and b$ will simply overwrite characters of a$ (from the start of a$), based on the length of b$.

[EDIT] Actually, now that I think about it, I can see why you would want to keep the n parameter.

a$="ugBASIC"
LEFT(a$,2)="x"
PRINT a$

This replaces the first 2 characters with "x", which could certainly prove useful.

image

@spotlessmind1975
Owner

Hi @poppichicken , and thank you for your feedback!

This last comment of yours actually makes me think that the directive controlling whether MID is substitutive or insertive should also apply to LEFT and RIGHT. So, in my opinion, we should imagine something like that for this command as well.

Coming to the rules:

(A) if a$="" then do nothing

I agree with this point, in fact it is already the current behavior of ugBASIC.

(B) if n=0 then do nothing

I agree with this rule too, and it is also the current behavior of ugBASIC.

(C) if b$="" then do nothing

Exactly, as above (it is the current behavior).

(D) if len(a$)<n then n=len(a$)

I confirm, it is the current behavior.

(E) if len(a$)<len(b$) then b$=left(b$,len(a$))

Exactly, as above (it is the current behavior).

(F) if n<>len(b$) then n=len(b$)

No, this rule is not correct, IMHO: in fact, if n<>LEN(b$), the smaller of the two will "win". This is why n is necessary, and this is to answer your following observation.
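Taken together, rules (A) to (E) plus the "smaller of the two wins" correction to (F) can be modelled with the Python sketch below. It is only an interpretation of the rules as written in this thread: `left_assign` is a hypothetical name, and it assumes the substitutive behaviour in which the length of a$ never changes.

```python
def left_assign(a: str, n: int, b: str) -> str:
    """Model of LEFT(a$, n) = b$ under rules A-E and 'the smaller wins'."""
    # Rules A, B, C: an empty target, a zero count, or an empty
    # replacement all mean "do nothing".
    if a == "" or n <= 0 or b == "":
        return a
    n = min(n, len(a))        # rule D: n cannot exceed the length of a$
    b = b[:len(a)]            # rule E: b$ cannot be longer than a$
    count = min(n, len(b))    # corrected rule F: the smaller of n and LEN(b$) wins
    return b[:count] + a[count:]


print(left_assign("musical", 3, "elephant"))
print(left_assign("mouse", 2, "xx"))
```

Under these assumptions n stays meaningful: it caps how many characters of b$ are used when b$ is longer than n, which is why it cannot simply be dropped from the command.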

Thank you again!

@poppichicken
Author

I understand, thanks Marco.

@spotlessmind1975 spotlessmind1975 added -fixed- Bug fixed and removed documentation Improvements or additions to documentation -fixed- Bug fixed labels Dec 8, 2024
@spotlessmind1975 spotlessmind1975 changed the title LEFT (instruction) vs LEFT (function) manual problem Improved behaviour of LEFT (instruction) Dec 8, 2024
@spotlessmind1975 spotlessmind1975 added the beta Candidate for BETA label Dec 15, 2024
Projects
Status: In progress

2 participants