Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stringLiteral parser of JavaTokenParsers ends up with a Stackoverflow error since scala upgrade #371

Open
srenault opened this issue Apr 15, 2021 · 4 comments

Comments

@srenault
Copy link

reproduction steps

using Scala 2.12 or 2.13

object Main {

  def main(args: Array[String]): Unit = {

    val RegScala213 = ("\""+"""([^"\x00-\x1F\x7F\\]|\\[\\'"bfnrt]|\\u[a-fA-F0-9]{4})*"""+"\"").r // Scala 2.13

    //val RegScala211 = ("\""+"""([^"\p{Cntrl}\\]|\\[\\'"bfnrt]|\\u[a-fA-F0-9]{4})*+"""+"\"").r // Scala 2.11

    val query = """"https://www.mywebsite.com/house/c/horses/302?warehouses=7fe2f2dd-c1ea-4661-acf6-e55fa1d9368a%2C72f2d9f8-d659-4f14-bb52-26d84f6979aa%2C17f49ed9-204a-42b3-8bf9-fc84e9829881%2C5dbe9433-2093-4112-a21b-4c7bcb718e3e%2C6e998dbb-3de7-4215-a7b9-859b8548993d%2Ce7e3e675-f6e0-475f-892a-f4386eef7d28%2Cf06146f4-7405-4c05-a6d9-83cebbd09951%2Cf9d770c9-a575-4020-b2ba-e9c4c4db47c7%2C36386fef-3abf-483b-a74d-b0de2974cea3%2C92707102-f711-46e3-a071-212c18f00ef3%2C44703d29-3dec-4bfd-b4ee-babdf3803348%2C96f9d86f-e487-44cd-9ae4-eb15518950ec%2C05edda44-e4ef-44ea-9838-71633643373a%2Cd70690e4-c35f-4647-9610-7d2b83fbac49%2C6b8afc90-f2a1-41f5-9532-2d6ab4d71b62%2C0626f9b9-b7e4-422c-9093-c3ae416053da%2C3e8a8728-f984-4b6a-a72a-4d354ed685b9%2C83101da9-5033-4bc6-bafb-23de37ba7fc8%2Cf8fe11a3-f1dd-44bd-8ed5-26850b1b85a6%2C314561b7-4c79-4a2c-84dd-c6b992a72c60%2Ca5094218-73f3-41bf-898a-f754afebf5fb%2Cf9ceaba7-5b53-41df-b6d4-649052b02474%2C1a8f1a4c-0598-427e-af75-2cf6cc3d23d1%2C192e0835-a51f-4618-b05c-840b50e3ccc9%2C641c6ab1-e2c0-4741-b694-d9b69a62e1d3%2Cddeaa4c6-463f-422f-bb57-e63c569dcfe6%2C93e970cf-d3c3-458d-9733-77d76dadf45b%2C694d6700-393b-438a-9f65-f4560e5f8fc8%2C6ca55c3d-3600-4d24-ac1c-5e3e1dee13fe%2Cd1e631aa-5ffc-4bd8-8032-d8bbb7e37139%2C5902c655-d859-4b84-a1d3-32c7532ea05c%2C53a635df-0ab0-48ff-baa2-970f0e3a5894%2Cfdc858c0-3436-4717-a04c-49f43692ebaa%2Cb3e8d8c0-abba-4ae2-8297-0200076fe01e%2C16c4d84b-7e44-4cac-ae90-e4a807be480b%2Ca629e7a1-e7b7-4dd8-86b2-578ec3f67bef%2Cc2f15ee4-f768-42fd-84d7-8fe03bc2e850%2C913e78ee-b019-49c6-9cdd-b0c708135c10%2C92f5b8c7-f458-4d3d-941f-453110cad138%2Cb22f1cad-2242-44a5-a97d-8094f4149e5f%2Cc2164749-f6f0-4f64-92ae-f59dd2aa29e5%2C722f2065-f32e-43a3-b706-d790d727a969%2C06e98486-1c4d-4cb4-9e70-d0ccaf40f571%2C8da3be85-4bfc-40e0-898e-cd87932801c7%2C4558f17f-1f67-49ee-9a4d-6a8ee6a18d23%2C0aa5a074-60dd-4cad-a30b-ba2c8f67f6f4%2C90baddd2-4ca2-49a1-bd4f-6845d3784574%2C87049a4e-ca40-414d-80f2-f6c566d62cd2%2C8b2b73b6-c727-4812-8cc4-12efd8f44a30%2C2c5d9b5e-6a90-489d-9306-dd2519a53da4%2C2ecbb252-71e7-4cc7-9096-d8149596ce45%2Cc27cee97-f090-4145-b8db-8bbdde30b779%2Cd4489788-33eb-4da0-9ded-2a764deb6048%2C89c0d271-4d70-43a6-8929-594b2298401d%2C0fe8b049-1e1d-4aff-be7a-7acdef611922%2Cbcc68252-e898-49f9-ac92-4d0e2f32af42%2Cc98e1b16-ed87-40ab-8b8c-86537b62b9f5%2C9296f4ea-81ed-49da-aa4f-59bbffa0848d%2C9343c619-5688-4e68-a9bb-0a1c3a84a188%2C9dd47b8a-6e75-4de3-aec3-b26cd5107cfc%2C8033b59e-23ef-4903-8bb6-0e6124aabff9%2Cca05fc7d-e5c5-42a6-b4b0-4f4bee81f517%2Cedf7dcfd-95d2-4af8-af58-a1c7c04d19f0%2C920720e5-0938-4f20-a2ba-8336dce283ab%2C13a9252b-a946-479e-ae6e-a80bc1d36d89%2Cf106f107-3f5c-4f09-992d-a4f8c754ad59%2C19de97a3-f55e-4aa8-9dfd-92be2a3366b3%2Ce28fb8a9-f362-4c3a-b3ae-b65acc2aed70%2C1f48c19a-4746-475d-a16d-3a4e8ca678a8%2C6dec5d60-e3ab-46d2-af77-04899d3d4ba7%2Cd34d11ba-5bcc-4958-8902-a2809d4841bd%2C13c76c21-64b9-4aea-917b-2b6fe93f832a%2C53a8c21c-7f10-40fc-9bfb-a7031747ba77%2C604a764a-9916-4006-b4c7-6cdd8d11de98%2C8c62146e-db90-4b4b-bb76-80b53500152f%2Cb3232b20-487e-4364-b829-8adb45463ad0%2C47b5438b-b883-4491-8a0c-664c577c0b2a%2C791341e5-67bb-4f60-98f9-1be7e161d28b%2Cbe28b8ce-d90d-4248-9da2-fee319d68641%2C3e8d191c-6226-4dcb-ac9b-ca6693b42c3d%2Cbd576b40-5b69-4d7c-b093-bfa69b90cc0d%2Ca3631cba-abe9-4394-b1e5-9291feb371df&cursor=WyIyMDIxLTA0LTA4IDA1OjU2OjE3LjA4MjUyNyswMDowMCIsICJDdWUgV29tZW5zIFNpemUgMTAgU2hpcnRzICYgQmxvdXNlcyBNYXJvb24vU3RyaXBlZCAiLCAiY3VlLXdvbWVucy1zaXplLTEwLXNoaXJ0cy1ibG91c2VzLW1hcm9vbnN0cmlwZWQiXQ%3D%3D","https://www.salvosstores.com.au/shop/c/womens/302?warehouses=7fe2f2dd-c1ea-4661-acf6-e55fa1d9368a%2C72f2d9f8-d659-4f14-bb52-26d84f6979aa%2C17f49ed9-204a-42b3-8bf9-fc84e9829881%2C5dbe9433-2093-4112-a21b-4c7bcb718e3e%2C6e998dbb-3de7-4215-a7b9-859b8548993d%2Ce7e3e675-f6e0-475f-892a-f4386eef7d28%2Cf06146f4-7405-4c05-a6d9-83cebbd09951%2Cf9d770c9-a575-4020-b2ba-e9c4c4db47c7%2C36386fef-3abf-483b-a74d-b0de2974cea3%2C92707102-f711-46e3-a071-212c18f00ef3%2C44703d29-3dec-4bfd-b4ee-babdf3803348%2C96f9d86f-e487-44cd-9ae4-eb15518950ec%2C05edda44-e4ef-44ea-9838-71633643373a%2Cd70690e4-c35f-4647-9610-7d2b83fbac49%2C6b8afc90-f2a1-41f5-9532-2d6ab4d71b62%2C0626f9b9-b7e4-422c-9093-c3ae416053da%2C3e8a8728-f984-4b6a-a72a-4d354ed685b9%2C83101da9-5033-4bc6-bafb-23de37ba7fc8%2Cf8fe11a3-f1dd-44bd-8ed5-26850b1b85a6%2C314561b7-4c79-4a2c-84dd-c6b992a72c60%2Ca5094218-73f3-41bf-898a-f754afebf5fb%2Cf9ceaba7-5b53-41df-b6d4-649052b02474%2C1a8f1a4c-0598-427e-af75-2cf6cc3d23d1%2C192e0835-a51f-4618-b05c-840b50e3ccc9%2C641c6ab1-e2c0-4741-b694-d9b69a62e1d3%2Cddeaa4c6-463f-422f-bb57-e63c569dcfe6%2C93e970cf-d3c3-458d-9733-77d76dadf45b%2C694d6700-393b-438a-9f65-f4560e5f8fc8%2C6ca55c3d-3600-4d24-ac1c-5e3e1dee13fe%2Cd1e631aa-5ffc-4bd8-8032-d8bbb7e37139%2C5902c655-d859-4b84-a1d3-32c7532ea05c%2C53a635df-0ab0-48ff-baa2-970f0e3a5894%2Cfdc858c0-3436-4717-a04c-49f43692ebaa%2Cb3e8d8c0-abba-4ae2-8297-0200076fe01e%2C16c4d84b-7e44-4cac-ae90-e4a807be480b%2Ca629e7a1-e7b7-4dd8-86b2-578ec3f67bef%2Cc2f15ee4-f768-42fd-84d7-8fe03bc2e850%2C913e78ee-b019-49c6-9cdd-b0c708135c10%2C92f5b8c7-f458-4d3d-941f-453110cad138%2Cb22f1cad-2242-44a5-a97d-8094f4149e5f%2Cc2164749-f6f0-4f64-92ae-f59dd2aa29e5%2C722f2065-f32e-43a3-b706-d790d727a969%2C06e98486-1c4d-4cb4-9e70-d0ccaf40f571%2C8da3be85-4bfc-40e0-898e-cd87932801c7%2C4558f17f-1f67-49ee-9a4d-6a8ee6a18d23%2C0aa5a074-60dd-4cad-a30b-ba2c8f67f6f4%2C90baddd2-4ca2-49a1-bd4f-6845d3784574%2C87049a4e-ca40-414d-80f2-f6c566d62cd2%2C8b2b73b6-c727-4812-8cc4-12efd8f44a30%2C2c5d9b5e-6a90-489d-9306-dd2519a53da4%2C2ecbb252-71e7-4cc7-9096-d8149596ce45%2Cc27cee97-f090-4145-b8db-8bbdde30b779%2Cd4489788-33eb-4da0-9ded-2a764deb6048%2C89c0d271-4d70-43a6-8929-594b2298401d%2C0fe8b049-1e1d-4aff-be7a-7acdef611922%2Cbcc68252-e898-49f9-ac92-4d0e2f32af42%2Cc98e1b16-ed87-40ab-8b8c-86537b62b9f5%2C9296f4ea-81ed-49da-aa4f-59bbffa0848d%2C9343c619-5688-4e68-a9bb-0a1c3a84a188%2C9dd47b8a-6e75-4de3-aec3-b26cd5107cfc%2C8033b59e-23ef-4903-8bb6-0e6124aabff9%2Cca05fc7d-e5c5-42a6-b4b0-4f4bee81f517%2Cedf7dcfd-95d2-4af8-af58-a1c7c04d19f0%2C920720e5-0938-4f20-a2ba-8336dce283ab%2C13a9252b-a946-479e-ae6e-a80bc1d36d89%2Cf106f107-3f5c-4f09-992d-a4f8c754ad59%2C19de97a3-f55e-4aa8-9dfd-92be2a3366b3%2Ce28fb8a9-f362-4c3a-b3ae-b65acc2aed70%2C1f48c19a-4746-475d-a16d-3a4e8ca678a8%2C6dec5d60-e3ab-46d2-af77-04899d3d4ba7%2Cd34d11ba-5bcc-4958-8902-a2809d4841bd%2C13c76c21-64b9-4aea-917b-2b6fe93f832a%2C53a8c21c-7f10-40fc-9bfb-a7031747ba77%2C604a764a-9916-4006-b4c7-6cdd8d11de98%2C8c62146e-db90-4b4b-bb76-80b53500152f%2Cb3232b20-487e-4364-b829-8adb45463ad0%2C47b5438b-b883-4491-8a0c-664c577c0b2a%2C791341e5-67bb-4f60-98f9-1be7e161d28b%2Cbe28b8ce-d90d-4248-9da2-fee319d68641%2C3e8d191c-6226-4dcb-ac9b-ca6693b42c3d%2Cbd576b40-5b69-4d7c-b093-bfa69b90cc0d%2Ca3631cba-abe9-4394-b1e5-9291feb371df&cursor=WyIyMDIxLTA0LTA4IDA1OjU2OjE3LjA4MjUyNyswMDowMCIsICJDdWUgV29tZW5zIFNpemUgMTAgU2hpcnRzICYgQmxvdXNlcyBNYXJvb24vU3RyaXBlZCAiLCAiY3VlLXdvbWVucy1zaXplLTEwLXNoaXJ0cy1ibG91c2VzLW1hcm9vbnN0cmlwZWQiXQ%3D%3D""""

    RegScala213.findAllMatchIn(query).toSeq
  }
}

problem

In the process of upgrading scala from 2.11 to 2.13, we are now hitting a StackOverflow error when using the scala parser combinator with the exact same code.
After investigating that issue, it turns out this is related to a change made on the stringLiteral parser regex.

In scala 2.11, the regex was ("\""+"""([^"\p{Cntrl}\\]|\\[\\'"bfnrt]|\\u[a-fA-F0-9]{4})*+"""+"\"").r and now it's ("\""+"""([^"\x00-\x1F\x7F\\]|\\[\\'"bfnrt]|\\u[a-fA-F0-9]{4})*"""+"\"").r.

The commit that changed the regex is 98737a2#diff-bce6263cece0f67933d12d2f709294696de6c553a335a8123819f0f97aa6587bL52.

The PR description states:
We also removed an invalid (and useless) + at the end of the regexp.

I'm not sure to get why the + is invalid. Can someone give me more details?

@SethTisue SethTisue transferred this issue from scala/bug Apr 15, 2021
@SethTisue
Copy link
Member

@sjrd you remember anything about that?

@som-snytt
Copy link
Contributor

som-snytt commented Apr 15, 2021

The "possessive quantifier" is necessary in that case on the JVM:

scala> val longish = "X" * 3000
val longish: String = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX...

scala> val r = raw"""(X|Y)*+""".r
val r: scala.util.matching.Regex = (X|Y)*+

scala> r.findAllMatchIn(longish).map(m => m.group(0).length()).toList
val res4: List[Int] = List(3000, 0)

scala> val r = raw"""(X|Y)*""".r
val r: scala.util.matching.Regex = (X|Y)*

scala> r.findAllMatchIn(longish)
java.lang.StackOverflowError
  at java.base/java.util.regex.Pattern$GroupTail.match(Pattern.java:4832)

Your stack may vary.

Your regex library may different on other platforms; the linked PR is for supporting JS. ("Java regular expressions support possessive quantifiers, but JavaScript does not," says an undergrad web page.)

https://next.sonarqube.com/sonarqube/coding_rules?open=java%3AS5998&rule_key=java%3AS5998

https://docs.oracle.com/javase/tutorial/essential/regex/quant.html

@som-snytt
Copy link
Contributor

A good interview question would be, "So are you greedy, reluctant, or possessive?"

@SethTisue SethTisue closed this as not planned Won't fix, can't repro, duplicate, stale Jan 10, 2023
@SethTisue SethTisue reopened this Jan 10, 2023
@SethTisue
Copy link
Member

SethTisue commented Jan 10, 2023

Has anyone tried to determine whether it's possible here for us to both be JS-friendly, and avoid consuming so much stack on the JVM?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants