Structural Search and Replace: What, Why, and How-to

Rationale: Imagine that we have a large code base that we need to browse or modify. For instance, we might want to use a library and find out how it works, or we might need to get acquainted with existing code and modify it. Another example: a new JDK becomes available and we are keen to see what has changed in the standard Java libraries. Conventional tools like textual find and replace may not fully address these goals, because with them it is easy to find or replace too much or too little. Of course, if we already know the source code well, the whole-words option and regular expressions can make our find-and-replace queries smarter.

The problem with the conventional approach, even with regular expressions, is that it knows nothing about the syntax and semantics of the source code we are working with. This is why we combined the search-and-replace feature with knowledge about the source code, producing the Structural Search and Replace (SSR) feature.

Important notes about the user interface and how SSR works

Code templates

The user interface of SSR (Figures 1 and 2) was made as close to conventional search-and-replace as possible. We still need to enter a search string and maybe a replacement string, and specify the case-sensitive option. However, the similarities end there. The search and replace strings in SSR are, in fact, the code fragments (templates) we would like to find or replace. Any template entered should be a well-formed Java construct of one of the following types:

  1. An expression, like new ProcessCancelledException()
  2. A statement or sequence of statements, like: a = b;
  3. A class, e.g. class A implements B {}
  4. A comment or javadoc comment, e.g. /** @beaninfo */
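
To make the four template types concrete, here is a small sketch of source code containing one occurrence of each sample template from the list above. The class name TemplateKinds and the stand-in declarations of B and ProcessCancelledException are assumptions added only so that the sample compiles on its own; the comments mark where each template would be expected to match under the rules described in this article.

```java
/** @beaninfo */
// (4) The javadoc comment above is an occurrence of the template: /** @beaninfo */
class TemplateKinds {

    // Hypothetical stand-ins so that the sample is self-contained.
    interface B {}
    static class ProcessCancelledException extends RuntimeException {}

    // (3) Occurrence of the class template: class A implements B {}
    static class A implements B {}

    static int run() {
        int a = 0;
        int b = 42;
        a = b; // (2) Occurrence of the statement template: a = b;
        try {
            // (1) Occurrence of the expression template: new ProcessCancelledException()
            throw new ProcessCancelledException();
        } catch (ProcessCancelledException e) {
            return a;
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```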

Note: The Copy existing templates button lets you quickly pick one of the many pre-built templates for Java constructs (classes, methods, ifs, etc.) or a user-defined template (if any), so quite often there is no need to enter code patterns unless there is some selection of existing source code.

The template code is matched against the source code mostly according to Java syntax rules. This implies, for instance, that whitespace in the template and in the source code is not significant. Certain semantic knowledge is also applied during the search, e.g. the order of class fields, methods, or references in an implements list is not significant.
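
As an illustration of these two rules, the sketch below shows code that differs only in layout and in declaration order. All names (LayoutAndOrder, Left, Right, A, B, C) are hypothetical. Going by the description above, a statement template such as a = b; would be expected to match all three assignments in assign(), and a class template such as class A implements B, C {} would be expected to match both nested A classes; this has not been verified against a particular IDE version.

```java
class LayoutAndOrder {
    interface B {}
    interface C {}

    static class Left {
        // Written the way a template might spell it.
        static class A implements B, C {
            int x = 1;
            int f() { return x; }
        }
    }

    static class Right {
        // Same structure: interfaces and members reordered, different line breaks.
        static class A
                implements C,
                           B {
            int f()
            {
                return x;
            }
            int x = 1;
        }
    }

    static int assign() {
        int a = 0, b = 7;
        a = b;   // same spacing as the template a = b;
        a=b;     // no spaces
        a   =
            b;   // split across lines
        return a;
    }

    public static void main(String[] args) {
        System.out.println(assign());
        System.out.println(new Left.A().f() == new Right.A().f());
    }
}
```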

The first two template types (see the list of four types above) are matched strictly, i.e. a match is found only when an exact occurrence of the code is found. The third and fourth template types, on the other hand, are matched loosely, meaning that a match may have other content not mentioned in the template. For instance, the search template new Runnable() {} will find all anonymous Runnable instances. The same convenient shorthand applies to method bodies. Thus, the following search template will find any Runnable with a method called someMethod:

new Runnable() {  void someMethod();  }
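
For comparison, here is a sketch of source code that such templates could be run against. The class RunnableMatches and its log field are hypothetical. Following the loose-matching rule described above, the first anonymous class would be expected to match both new Runnable() {} and the someMethod template, while the second would be expected to match only new Runnable() {}.

```java
class RunnableMatches {
    // Records which methods ran, so the behavior can be checked.
    static final StringBuilder log = new StringBuilder();

    static Runnable withHelper() {
        // Declares someMethod in addition to run(); extra members and
        // method bodies do not prevent a loose match.
        return new Runnable() {
            void someMethod() { log.append("helper;"); }

            @Override
            public void run() {
                someMethod();
                log.append("run;");
            }
        };
    }

    static Runnable plain() {
        // No someMethod here, so only the empty-body template applies.
        return new Runnable() {
            @Override
            public void run() { log.append("plain;"); }
        };
    }

    public static void main(String[] args) {
        withHelper().run();
        plain().run();
        System.out.println(log);
    }
}
```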

References to classes, fields, variables, methods, etc. are treated literally (e.g. search template a = b; matches only a = b;) except for a case mentioned below.
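
A short sketch of what literal treatment of names means in practice. The class LiteralNames and its variables are hypothetical; the comments state what the rule above implies for the template a = b;.

```java
import java.util.Arrays;

class LiteralNames {
    static int[] demo() {
        int a = 1, b = 2, c = 3;
        a = b; // expected to match: same names as in the template
        c = b; // not expected to match: the assignment target is c, not a
        b = a; // not expected to match: the operands are swapped
        return new int[] {a, b, c};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```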

Note: SSR operates on the concrete syntax trees of the source code and of the supplied code templates. Thus, using code templates that contain errors, or applying SSR to source code with errors ("red" code), could cause unexpected or undesired effects.

Figure 1 Structural Search dialog

Maxim Mossienko

Maxim is a Senior Software Developer on the IntelliJ IDEA project, where he is responsible for the support of server-side programming. He is currently building CSS and JavaScript integration within the IDE.

Contact Maxim via email: maxim(.)mossienko