
JUnit: P2J11Test.java
Strings are a stark random access data structure that stores linear text, the universal and portable
representation of all information. Under the hood, strings are character arrays, but the primitive
nature of such arrays is hidden behind a better interface of public methods. Surprising algorithms
have been developed over the years to quickly search and analyze both the string itself and the
semantic information encoded in it. Even the seemingly simple task of searching for a pattern inside
the given text can be optimized in countless ways, let alone higher level tasks on strings.
Sometimes the string data structure can be augmented with additional data that speeds up
operations that would be inefficient when given only the raw and unadorned text string itself. Even
though this data in principle adds nothing new to the mix in that its structure is fully determined by
the original string itself, having that information available as yet another form of space-time
tradeoff can provide significant speedups.
Computed from the given text string with n characters, its suffix array is an n-element array of
integers that contains the position indices 0, …, n-1 to the text. Since each position is stored in the
suffix array exactly once, the suffix array is automatically some permutation of order n. The suffix
array lists these n positions sorted in the lexicographic order (also called the “dictionary order”) of
the suffixes that start from each position. Note that the presentation on the linked Wikipedia page
on suffix arrays uses one-based position indexing to the string, whereas here we naturally use Java’s
zero-based indexing to remain consistent to our principles.
For example, the non-empty suffixes of the string "hello" are "hello", "ello", "llo", "lo"
and "o", corresponding to positions ranging from 0 to 4 in the original string. Sorting these
positions to the lexicographic order of the corresponding suffixes gives these positions in the order
[1, 0, 2, 3, 4], which then becomes the suffix array of the original string. Since all n suffixes
of text have a different length, the possibility of equality in their lexicographic comparison can
never arise. This guarantees the uniqueness of the suffix array for the given text.
For reasons of simplicity and convenience, suffix arrays are represented in this lab as instances of
some subtype of List<Integer>, despite the use of the standard technical term of “suffix array”.
Create a new class named P2J11, and there first the method
public static List<Integer> buildSuffixArray(String text)
that builds and returns the suffix array for the given text. In this lab, this can be done with the
naive
of this algorithm, a string whose all characters are the same, is never realized. The easiest way to
implement this naive algorithm in Java should be to define yet another custom subtype of
Comparator<Integer> whose method compareTo lexicographically compares the two
substrings that start from the positions that it receives as parameters. In this method, first define
the local variable ArrayList<Integer> result, and fill it up the brim with the integers from 0
to n-1. Then, just use the utility method Collections.sort to sort this result list with your
custom Comparator<Integer>. This discipline should allow your buildSuffixArray
method to be reasonably fast even when building the suffix array for the entire War and Peace, as
performed by the JUnit test class using the warandpeace.txt data file.
Once the suffix array has been constructed for the given fixed text, it can be used to rapidly find all
positions inside text where the given pattern occurs. These would be precisely the very same
positions whose suffixes start with that pattern! Having already done the work of sorting these
suffixes in lexicographic order in the previous preprocessing step allows us to use a slightly
modified binary search performed to the suffix array to find the lexicographically first such suffix.
Since all suffixes that start with pattern must necessarily follow each other as a bunch in the
sorted array of such suffixes, looping through all these positions is a straightforward while-loop
once the binary search has determined the first such position.
To achieve all that, write the second method
public static List<Integer> find(String pattern, String text,
List<Integer> suffix) that creates and returns a list of positions of the original text that contain the given pattern, using the given suffix array to speed up this search. Note that this returned list of positions must be sorted in ascending order of the positions themselves, instead of the lexicographic order of the suffixes of text that start at these positions.

Trending nowThis is a popular solution!
Step by stepSolved in 2 steps with 2 images

When testing, I get the following error, Can you please help
===================================================
java.lang.OutOfMemoryError: Java heap space
at java.base/java.util.Arrays.copyOfRange(Arrays.java:3822)
at java.base/java.lang.StringLatin1.newString(StringLatin1.java:769)
at java.base/java.lang.String.substring(String.java:2709)
at java.base/java.lang.String.substring(String.java:2677)
at P2J11.buildSuffixArray(P2J11.java:9)
at P2J11Test.testUsingWarAndPeaceAsData(P2J11Test.java:69)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
at com.intellij.rt.junit.IdeaTestRunner$Repeater$1.execute(IdeaTestRunner.java:38)
at com.intellij.rt.execution.junit.TestsRepeater.repeat(TestsRepeater.java:11)
at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:35)
at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:235)
=============================================================
Junit test
@Test public void testUsingWarAndPeaceAsData() {
// Construct a version of War and Peace with everything in lowercase,
// with newlines converted into single whitespaces.
StringBuilder sb = new StringBuilder();
try(Scanner scan = new Scanner(new File("warandpeace.txt"))) {
while(scan.hasNext()) {
String line = scan.next();
sb.append(line.toLowerCase());
sb.append(" ");
}
}
catch(Exception e) {
System.out.println("Unable to read file warandpeace.txt.");
fail();
}
String text = sb.toString();
List<Integer> suffix = P2J11.buildSuffixArray(text);
String[] pats = {
"hairpin", "dearest", "chicken", "germany", "soup",
"when the butler with a bottle"
};
List<List<Integer>> expected = Arrays.asList(
// hairpin
Collections.emptyList(),
// dearest
Arrays.asList(401455, 589251, 673175, 756728, 762806, 824448, 824892,
876326, 1874525, 2097431, 2824422, 2824638, 3001502, 3069811, 3070789),
// chicken
Arrays.asList(1000200, 1322792, 1323345, 1709728, 1858789, 2112805),
// germany
Arrays.asList(149169, 1625813, 2387228, 2621602),
// soup
Arrays.asList(147783, 546991, 772564, 1954975, 2370800, 2534921,
2667437, 2751268, 3010037, 3010169),
// when the butler with a bottle
Collections.singletonList(149207)
);
for(int i = 0; i < pats.length; i++) {
String pat = pats[i];
List<Integer> expect = expected.get(i);
List<Integer> find = P2J11.find(pat, text, suffix);
for(int pos: find) {
assertTrue(text.substring(pos).startsWith(pat));
}
assertEquals(expect, find);
}
CRC32 check = new CRC32();
for(int i = 0; i < text.length(); i++) {
check.update(text.charAt(i));
}
Random rng = new Random(12345);
for(int i = 0; i < 100; i++) {
int pos = rng.nextInt(1000000);
int len = rng.nextInt(20) + 5;
String pat = text.substring(pos, pos + len);
List<Integer> find = P2J11.find(pat, text, suffix);
//System.out.println("<" + pat + ">: " + find.size());
for(int j: find) { check.update(j); }
}
assertEquals(3893756230L, check.getValue());
}
==========================================================
Warandpeace.txt sample
"Well, Prince, so Genoa and Lucca are now just family estates of the
Buonapartes. But I warn you, if you don't tell me that this means war,
if you still try to defend the infamies and horrors perpetrated by that
Antichrist--I really believe he is Antichrist--I will have nothing more
to do with you and you are no longer my friend,
When testing, I get the following error, Can you please help
===================================================
java.lang.OutOfMemoryError: Java heap space
at java.base/java.util.Arrays.copyOfRange(Arrays.java:3822)
at java.base/java.lang.StringLatin1.newString(StringLatin1.java:769)
at java.base/java.lang.String.substring(String.java:2709)
at java.base/java.lang.String.substring(String.java:2677)
at P2J11.buildSuffixArray(P2J11.java:9)
at P2J11Test.testUsingWarAndPeaceAsData(P2J11Test.java:69)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
at com.intellij.rt.junit.IdeaTestRunner$Repeater$1.execute(IdeaTestRunner.java:38)
at com.intellij.rt.execution.junit.TestsRepeater.repeat(TestsRepeater.java:11)
at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:35)
at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:235)
=============================================================
Junit test
@Test public void testUsingWarAndPeaceAsData() {
// Construct a version of War and Peace with everything in lowercase,
// with newlines converted into single whitespaces.
StringBuilder sb = new StringBuilder();
try(Scanner scan = new Scanner(new File("warandpeace.txt"))) {
while(scan.hasNext()) {
String line = scan.next();
sb.append(line.toLowerCase());
sb.append(" ");
}
}
catch(Exception e) {
System.out.println("Unable to read file warandpeace.txt.");
fail();
}
String text = sb.toString();
List<Integer> suffix = P2J11.buildSuffixArray(text);
String[] pats = {
"hairpin", "dearest", "chicken", "germany", "soup",
"when the butler with a bottle"
};
List<List<Integer>> expected = Arrays.asList(
// hairpin
Collections.emptyList(),
// dearest
Arrays.asList(401455, 589251, 673175, 756728, 762806, 824448, 824892,
876326, 1874525, 2097431, 2824422, 2824638, 3001502, 3069811, 3070789),
// chicken
Arrays.asList(1000200, 1322792, 1323345, 1709728, 1858789, 2112805),
// germany
Arrays.asList(149169, 1625813, 2387228, 2621602),
// soup
Arrays.asList(147783, 546991, 772564, 1954975, 2370800, 2534921,
2667437, 2751268, 3010037, 3010169),
// when the butler with a bottle
Collections.singletonList(149207)
);
for(int i = 0; i < pats.length; i++) {
String pat = pats[i];
List<Integer> expect = expected.get(i);
List<Integer> find = P2J11.find(pat, text, suffix);
for(int pos: find) {
assertTrue(text.substring(pos).startsWith(pat));
}
assertEquals(expect, find);
}
CRC32 check = new CRC32();
for(int i = 0; i < text.length(); i++) {
check.update(text.charAt(i));
}
Random rng = new Random(12345);
for(int i = 0; i < 100; i++) {
int pos = rng.nextInt(1000000);
int len = rng.nextInt(20) + 5;
String pat = text.substring(pos, pos + len);
List<Integer> find = P2J11.find(pat, text, suffix);
//System.out.println("<" + pat + ">: " + find.size());
for(int j: find) { check.update(j); }
}
assertEquals(3893756230L, check.getValue());
}
==========================================================
Warandpeace.txt sample
"Well, Prince, so Genoa and Lucca are now just family estates of the
Buonapartes. But I warn you, if you don't tell me that this means war,
if you still try to defend the infamies and horrors perpetrated by that
Antichrist--I really believe he is Antichrist--I will have nothing more
to do with you and you are no longer my friend,
- VariableReferenceNode.java, OperationNode.java, ConstantNode.java, and PatternNode.java must have their own java classes with the correct methods implemented: OperationNode: Has enum, left and Optional right members, good constructors and ToString is good VariableReferenceNode: Has name and Optional index, good constructors and ToString is good Constant & Node Pattern: Have name, good constructor and ToString is good Make sure to include the screenshot of the output of Parser.java as well.arrow_forwardThe language is Java. Given my code, I want to make my output look like the first part of the first image's output, but with the numbers in my output. The other two pictures are of my code. I want to make the output look like a tree, as opposed to being close together like that. Thanks.arrow_forwardI need help creating a java abstract class program- image described belowarrow_forward
- Stuck need help! The class I'm taking is computer science discrete structures. Problem is attached. please view attachment before answering. Really struggling with this concept. Thank you so much.arrow_forwardwhy we use the Comparable interface to sort Java Objects that do not have a "natural" ordering. Provide an examplearrow_forward
- Database System ConceptsComputer ScienceISBN:9780078022159Author:Abraham Silberschatz Professor, Henry F. Korth, S. SudarshanPublisher:McGraw-Hill EducationStarting Out with Python (4th Edition)Computer ScienceISBN:9780134444321Author:Tony GaddisPublisher:PEARSONDigital Fundamentals (11th Edition)Computer ScienceISBN:9780132737968Author:Thomas L. FloydPublisher:PEARSON
- C How to Program (8th Edition)Computer ScienceISBN:9780133976892Author:Paul J. Deitel, Harvey DeitelPublisher:PEARSONDatabase Systems: Design, Implementation, & Manag...Computer ScienceISBN:9781337627900Author:Carlos Coronel, Steven MorrisPublisher:Cengage LearningProgrammable Logic ControllersComputer ScienceISBN:9780073373843Author:Frank D. PetruzellaPublisher:McGraw-Hill Education





