PV 168 Seminar 13

Prerequisites for the seminar
● NetBeans IDE 12.0
  ○ Download from https://netbeans.apache.org/download/index.html

Address Database
● Loads a list of addresses in the Czech Republic (DataLoader)
● Finds all addresses matching a given (possibly incomplete) specification (AddressFinder)
● Executes a performance test to help evaluate CPU and memory consumption (PerformanceTest)
● There are multiple implementations of AddressFinder using various data structures and search algorithms
● The concrete implementation is selected in a dialog box when the application starts

SimpleAddressFinder
● Stores data as a simple List, no optimized structure
● Search is done sequentially, all addresses must be traversed
● Multiple search strategies:
  ○ ForEachSearchStrategy – based on a for-each loop (#1)
  ○ StreamSearchStrategy – based on Streams (#2)
  ○ ParallelStreamSearchStrategy – based on parallel Streams. Parallel streams utilize multiple threads with the Fork/Join framework. The thread count corresponds to the CPU count. (#3)

IndexedAddressFinder
● Stores data in a map-based structure
● The search process consists of two steps:
  ○ Finding the collection of addresses with the appropriate AddressBase (municipality, municipality district, street, and district)
  ○ Then finding addresses within this collection with the appropriate orientation number and/or house number
● The first step is implemented as a map lookup to avoid sequential search
● Multiple implementations exist for the second step:
  ○ IndexedAddressGroup – addresses with the same AddressBase are stored in Maps, find by number(s) is done as a map lookup (#4)
  ○ SimpleAddressGroup – addresses with the same AddressBase are stored in a simple List, find by number(s) is done sequentially (#5)

Profiler
● Tools for evaluating application performance
  ○ CPU time
  ○ Memory usage
● The profiler integrated in IntelliJ IDEA does not provide good support for memory usage profiling 😕

Instructions (screenshots on next slides)
● Clone the project https://gitlab.fi.muni.cz/pv168/address-database
● Open the project in NetBeans
● Set Java Platform to JDK 11 or newer
  ○ Right-click the project AddressDatabase in the left panel and choose Properties
  ○ Select category Build > Compile and choose Java Platform
  ○ If you don’t see a suitable JDK there, click Manage Java Platforms… to add a JDK
● Open the profiler (menu Profile > Profile Project or Ctrl+F2)
● Configure Session by selecting the Telemetry profile
● Run the application in the profiler
● Choose AddressFinder implementation #1
● Check the results (notice the detailed numbers when hovering over the graphs)

How to set JDK in NetBeans [screenshot]

Profiler window [screenshot]

Tips & Tricks: Snapshots
● The Methods and Objects profiles allow taking snapshots of the results
  ○ Useful for later comparison of results for different AddressFinder implementations
  ○ Unfortunately, there is no such option for Telemetry
● A snapshot is taken if you confirm it in the dialog shown when the application finishes
● Snapshots can be saved to disk
  ○ For long-term comparison
● Saved snapshots can be renamed
  ○ In the Snapshots window (menu Window > Profiling > Snapshots)
  ○ Better identification, e.g. AddressFinder #1

Tips & Tricks: Taking Snapshot [screenshot]

Tips & Tricks: Saving Snapshot [screenshot]

Tips & Tricks: Renaming Saved Snapshot [screenshot]

Seminar Task 1 (AddressFinder #1)
● Run the application with the Telemetry profile
  ○ Check the Memory graph to see the used heap size after loading the data
  ○ Check the output tab to see the average time per single search
  ○ Write down both numbers
● Run the application with the Methods profile
  ○ Check where the application spent most of the time in the main thread
● Run the application with the Objects profile
  ○ Check which object types occupied most of the heap

Seminar Task 2 (AddressFinder #2)
● Run the application with the Telemetry profile
  ○ Check the Memory graph to see the used heap size after loading the data
  ○ Check the output tab to see the average time per single search
  ○ Write down both numbers
● Run the application with the Methods profile
  ○ Check where the application spent most of the time in the main thread
● Run the application with the Objects profile
  ○ Check which object types occupied most of the heap
● Discuss and write down answers to these questions:
  ○ Is there any significant difference in CPU time or memory consumption compared to #1?

Seminar Task 3 (AddressFinder #3)
● Run the application with the Telemetry profile
  ○ Check the Memory graph to see the used heap size after loading the data
  ○ Check the output tab to see the average time per single search and the CPU count
  ○ Write down all three numbers
● Run the application with the Methods profile
  ○ Check where the application spent most of the time in the main thread
  ○ Check where the application spent most of the time in the ForkJoinPool.* threads
● Run the application with the Objects profile
  ○ Check which object types occupied most of the heap
● Discuss and write down answers to these questions:
  ○ Is there any significant difference in CPU time or memory consumption compared to #1 or #2?
  ○ How many ForkJoinPool.* threads were running?

Seminar Task 4 (AddressFinder #4)
● Run the application with the Telemetry profile
  ○ Check the Memory graph to see the used heap size after loading the data
  ○ Check the output tab to see the average time per single search
  ○ Write down both numbers
● Run the application with the Methods profile
  ○ Check where the application spent most of the time in the main thread
● Run the application with the Objects profile
  ○ Check which object types occupied most of the heap
● Discuss and write down answers to these questions:
  ○ Is there any significant difference in CPU time or memory consumption compared to #1 – #3?

Seminar Task 5 (AddressFinder #5)
● Run the application with the Telemetry profile
  ○ Check the Memory graph to see the used heap size after loading the data
  ○ Check the output tab to see the average time per single search
  ○ Write down both numbers
● Run the application with the Methods profile
  ○ Check where the application spent most of the time in the main thread
● Run the application with the Objects profile
  ○ Check which object types occupied most of the heap
● Discuss and write down answers to these questions:
  ○ Is there any significant difference in CPU time or memory consumption compared to #1 – #4?

Seminar Task 6 (Evaluation)
● Which implementation was the least CPU efficient (slowest) one?
● Which implementation was the most CPU efficient (fastest) one?
● How much did parallel processing in #3 help?
● What is the cost of the optimization in #4? Is it worth it?
● Which implementation would you recommend using?

Conclusion
● Link to slides: https://is.muni.cz/auth/el/fi/podzim2020/PV168/um/seminare/PV168-seminar-13.pdf

Any questions?
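
For reference, the three SimpleAddressFinder search strategies (#1 to #3) can be sketched as below. This is only a minimal illustration, not the project's code: the `Address` class, its fields, and the method names are hypothetical stand-ins, and matching is reduced to an exact street comparison.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class SearchStrategiesSketch {

    // Hypothetical, simplified address; the project's real Address class is not shown in the slides.
    public static final class Address {
        final String street;
        final int houseNumber;

        public Address(String street, int houseNumber) {
            this.street = street;
            this.houseNumber = houseNumber;
        }
    }

    // #1-like: for-each loop, every address is visited on the calling thread.
    public static List<Address> forEachSearch(List<Address> all, String street) {
        List<Address> result = new ArrayList<>();
        for (Address a : all) {
            if (a.street.equals(street)) {
                result.add(a);
            }
        }
        return result;
    }

    // #2-like: the same sequential traversal expressed as a Stream pipeline.
    public static List<Address> streamSearch(List<Address> all, String street) {
        return all.stream()
                .filter(a -> a.street.equals(street))
                .collect(Collectors.toList());
    }

    // #3-like: a parallel Stream splits the list and filters the parts on several threads.
    public static List<Address> parallelStreamSearch(List<Address> all, String street) {
        return all.parallelStream()
                .filter(a -> a.street.equals(street))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Address> all = List.of(
                new Address("Nerudova", 1),
                new Address("Husova", 7),
                new Address("Nerudova", 3));
        System.out.println(forEachSearch(all, "Nerudova").size());
        System.out.println(streamSearch(all, "Nerudova").size());
        System.out.println(parallelStreamSearch(all, "Nerudova").size());
    }
}
```

All three variants still inspect every element, so they differ in overhead and in the number of threads used, not in the amount of work per search.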
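
The two-step search of IndexedAddressFinder (#4 and #5) can be sketched in a similar way. Again this is a hedged illustration under assumptions: the string key standing in for AddressBase, the `Address` fields, and all method names are hypothetical, and only the house number is used in the second step.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IndexedFinderSketch {

    // Hypothetical, simplified address; the real AddressBase has more parts (districts).
    public static final class Address {
        final String municipality;
        final String street;
        final int houseNumber;

        public Address(String municipality, String street, int houseNumber) {
            this.municipality = municipality;
            this.street = street;
            this.houseNumber = houseNumber;
        }
    }

    // Assumption: a plain string key stands in for the AddressBase object.
    static String baseKey(String municipality, String street) {
        return municipality + "|" + street;
    }

    // #5-like: step one is a map lookup, step two scans a List.
    private final Map<String, List<Address>> listGroups = new HashMap<>();
    // #4-like: step one and step two are both map lookups.
    private final Map<String, Map<Integer, Address>> mapGroups = new HashMap<>();

    public void add(Address a) {
        String key = baseKey(a.municipality, a.street);
        listGroups.computeIfAbsent(key, k -> new ArrayList<>()).add(a);
        mapGroups.computeIfAbsent(key, k -> new HashMap<>()).put(a.houseNumber, a);
    }

    // Returns the matching address or null; the group itself is traversed sequentially.
    public Address findInListGroup(String municipality, String street, int houseNumber) {
        List<Address> group = listGroups.getOrDefault(baseKey(municipality, street), List.of());
        for (Address a : group) {
            if (a.houseNumber == houseNumber) {
                return a;
            }
        }
        return null;
    }

    // Returns the matching address or null; no traversal is needed.
    public Address findInMapGroup(String municipality, String street, int houseNumber) {
        Map<Integer, Address> group = mapGroups.get(baseKey(municipality, street));
        return group == null ? null : group.get(houseNumber);
    }

    public static void main(String[] args) {
        IndexedFinderSketch finder = new IndexedFinderSketch();
        finder.add(new Address("Brno", "Nerudova", 1));
        finder.add(new Address("Brno", "Nerudova", 3));
        finder.add(new Address("Praha", "Husova", 7));
        System.out.println(finder.findInListGroup("Brno", "Nerudova", 3) != null);
        System.out.println(finder.findInMapGroup("Praha", "Husova", 7) != null);
    }
}
```

The extra maps are what the Objects profile should make visible: the faster lookup is paid for with additional objects on the heap.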
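
When comparing the Telemetry numbers with what the JVM itself reports, a small helper like the following may be useful. It is a sketch using only standard JDK calls; the class and method names are hypothetical and it is not part of the project. Parallel streams run on the common Fork/Join pool by default, so its parallelism is printed next to the CPU count.

```java
import java.util.concurrent.ForkJoinPool;

public class RuntimeInfoSketch {

    // Number of processors available to the JVM.
    public static int cpuCount() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Target parallelism of the common pool used by parallel streams by default.
    public static int commonPoolParallelism() {
        return ForkJoinPool.commonPool().getParallelism();
    }

    // Rough used-heap estimate; it includes garbage that has not been collected yet.
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.out.println("CPU count: " + cpuCount());
        System.out.println("Common pool parallelism: " + commonPoolParallelism());
        System.out.println("Used heap (MB): " + usedHeapBytes() / (1024 * 1024));
    }
}
```

The used-heap value is only an approximation of what the Memory graph shows, because it depends on when the garbage collector last ran.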