Java: Highlight Text or Get Highlighted Text in Word Documents

The “Highlight” function in Microsoft Word lets you add a bright background color to specific words or phrases, which is useful when you want to emphasize important information in a document. Word also lets you find highlighted text through the “Find” function. In this article, I will explain how to programmatically highlight text or get highlighted text in Word documents in Java using the Free Spire.Doc for Java library.

  • Highlight The First Occurrence of a Specific Text in a Word Document
  • Highlight All Occurrences of a Specific Text in a Word Document
  • Get Highlighted Text in a Word Document

Add Dependencies

Method 1: If you are using Maven, you can import the JAR file of Free Spire.Doc for Java into your application by adding the following code to your project’s pom.xml file.

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.doc.free</artifactId>
        <version>5.2.0</version>
    </dependency>
</dependencies>

Method 2: If you are not using Maven, you can download Spire.Doc for Java from this link, extract the zip file, and then import the Spire.Doc.jar file under the lib folder into your project as a dependency.

Highlight The First Occurrence of a Specific Text in a Word Document using Java

The following steps demonstrate how to highlight the first occurrence of a specific text in a Word document:

  • Initialize an instance of the Document class.
  • Load a Word document using Document.loadFromFile() method.
  • Find the first occurrence of a specific text using Document.findString() method.
  • Get the found text as a single text range using TextSelection.getAsOneRange() method.
  • Set a highlight color for the text range using TextRange.getCharacterFormat().setHighlightColor() method.
  • Save the result document using Document.saveToFile() method.
import com.spire.doc.Document;
import com.spire.doc.FileFormat;
import com.spire.doc.documents.TextSelection;
import com.spire.doc.fields.TextRange;

import java.awt.*;

public class HighlightText {
    public static void main(String []args){
        //Create a Document instance
        Document document = new Document();
        //Load a Word document
        document.loadFromFile("Input1.docx");

        //Find the first occurrence of a specific text
        TextSelection selection = document.findString("World Cup", false, true);

        //Get the found text as a single text range
        TextRange textRange = selection.getAsOneRange();
        //Set a highlight color
        textRange.getCharacterFormat().setHighlightColor(Color.YELLOW);

        //Save the result document
        document.saveToFile("HighlightText.docx", FileFormat.Docx_2013);
    }
}
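
The second and third arguments passed to findString() above appear to be the case-sensitivity and whole-word flags; that reading is an assumption on my part. The sketch below shows what a case-insensitive, whole-word search means, using only java.util.regex. The class and method names are made up for illustration and are not part of Spire.Doc.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Spire.Doc: mimics a search with
// "case sensitive" and "whole word" switches using plain regular expressions
class WholeWordSearchSketch {
    // Returns the index of the first match, or -1 when there is none
    static int findFirst(String text, String phrase, boolean caseSensitive, boolean wholeWord) {
        // Quote the phrase so that any regex metacharacters are treated literally
        String regex = Pattern.quote(phrase);
        if (wholeWord) {
            // Require a word boundary on both sides of the phrase
            regex = "\\b" + regex + "\\b";
        }
        int flags = caseSensitive ? 0 : Pattern.CASE_INSENSITIVE;
        Matcher matcher = Pattern.compile(regex, flags).matcher(text);
        return matcher.find() ? matcher.start() : -1;
    }

    public static void main(String[] args) {
        String text = "The WORLD CUP final; World Cups are rare.";
        // Case-insensitive, whole word: matches "WORLD CUP" at index 4
        System.out.println(findFirst(text, "World Cup", false, true));
        // Case-sensitive, whole word: "World Cups" is not a whole-word match, so -1
        System.out.println(findFirst(text, "World Cup", true, true));
    }
}
```

With the whole-word switch on, "World Cups" is skipped because the phrase is followed directly by another letter.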

Highlight All Occurrences of a Specific Text in a Word Document using Java

The following steps demonstrate how to highlight all the occurrences of a specific text in a Word document:

  • Initialize an instance of the Document class.
  • Load a Word document using Document.loadFromFile() method.
  • Find all the occurrences of a specific text using Document.findAllString() method.
  • Iterate through the found text.
  • Get each text as a single text range using TextSelection.getAsOneRange() method.
  • Set a highlight color for the text range using TextRange.getCharacterFormat().setHighlightColor() method.
  • Save the result document using Document.saveToFile() method.
import com.spire.doc.Document;
import com.spire.doc.FileFormat;
import com.spire.doc.documents.TextSelection;
import com.spire.doc.fields.TextRange;

import java.awt.*;

public class HighlightAllMatchedText {
    public static void main(String []args){
        //Create a Document instance
        Document document = new Document();
        //Load a Word document
        document.loadFromFile("Input1.docx");

        //Find all occurrences of a specific text
        TextSelection[] text = document.findAllString("World Cup", false, true);

        //Iterate through the found text
        for(TextSelection selection : text){
            //Get each text as a single text range
            TextRange textRange = selection.getAsOneRange();
            //Set a highlight color
            textRange.getCharacterFormat().setHighlightColor(Color.YELLOW);
        }

        //Save the result document
        document.saveToFile("HighlightAllMatchedText.docx", FileFormat.Docx_2013);
    }
}

Get Highlighted Text in a Word Document using Java

The following steps demonstrate how to get all the text highlighted with a specific color in a Word document:

  • Initialize an instance of the Document class.
  • Load a Word document using Document.loadFromFile() method.
  • Initialize an instance of the StringBuilder class.
  • Iterate through all sections in the document.
  • Iterate through all paragraphs in each section.
  • Iterate through all child objects in each paragraph.
  • Check if the current child object is of TextRange type.
  • If the result is true, typecast the child object as TextRange.
  • Find the text range highlighted with a specific color using TextRange.getCharacterFormat().getHighlightColor() method.
  • Get the text using TextRange.getText() method and then add it to the StringBuilder.
  • Write the text in the StringBuilder into a text file.
import com.spire.doc.Document;
import com.spire.doc.DocumentObject;
import com.spire.doc.Section;
import com.spire.doc.documents.Paragraph;
import com.spire.doc.fields.TextRange;

import java.awt.*;
import java.io.FileWriter;
import java.io.IOException;

public class GetHighlightedText {
    public static void main(String []args) {
        //Create a Document instance
        Document document = new Document();
        //Load a Word document
        document.loadFromFile("Input2.docx");

        StringBuilder sb = new StringBuilder();
        //Loop through all sections in the document
        for(Section section :(Iterable<Section>) document.getSections()){
            //Loop through all paragraphs in each section
            for(Paragraph paragraph : (Iterable<Paragraph>) section.getBody().getParagraphs())
            {
                //Loop through all child objects in each paragraph
                for(DocumentObject obj : (Iterable<DocumentObject>) paragraph.getChildObjects())
                {
                    //Check if the current child object is of TextRange type
                    if (obj instanceof TextRange)
                    {
                        TextRange textRange = (TextRange) obj;
                        //Check if the text range is highlighted with a specific color
                        if (textRange.getCharacterFormat().getHighlightColor().equals(Color.YELLOW)){
                            //Get the highlighted text
                            String highlightedText = textRange.getText();
                            sb.append(highlightedText + "\n");
                        }
                    }
                }
            }
        }

        //Save the highlighted text to a text file; try-with-resources closes the writer
        try (FileWriter writer = new FileWriter("HighlightedText.txt")) {
            writer.write(sb.toString());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
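
The highlight check above relies on java.awt.Color.equals(), which compares the red, green, blue and alpha components rather than object identity. The small sketch below, independent of Spire.Doc, shows how that comparison behaves. The class and method names are illustrative, and the null guard is an extra precaution I added, not something the library is known to require.

```java
import java.awt.Color;

// Illustrative helper, not part of Spire.Doc
class HighlightColorCheckSketch {
    // True when the color has the same RGBA components as the target; null-safe
    static boolean isHighlightedWith(Color actual, Color target) {
        return actual != null && actual.equals(target);
    }

    public static void main(String[] args) {
        // Same components as Color.YELLOW, different object
        System.out.println(isHighlightedWith(new Color(255, 255, 0), Color.YELLOW));      // prints "true"
        // Same RGB but half-transparent, so the alpha component differs
        System.out.println(isHighlightedWith(new Color(255, 255, 0, 128), Color.YELLOW)); // prints "false"
        // No color at all
        System.out.println(isHighlightedWith(null, Color.YELLOW));                        // prints "false"
    }
}
```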

C#/VB.NET: Highlight Text or Get Highlighted Text in Word Documents

Highlighting text is a great way to make certain text stand out in a document. If you want to quickly draw your readers’ attention to some important information in your Word document, you can highlight it with a bright color. Sometimes, you may also want to find the text highlighted with a specific color in a Word document. In this article, I will demonstrate how to highlight text or get highlighted text in Word documents using C# and VB.NET.

  • Highlight The First Occurrence of a Specific Text in a Word Document
  • Highlight All Occurrences of a Specific Text in a Word Document
  • Get Highlighted Text in a Word Document

Installation

To highlight or get highlighted text in Word documents, this article uses Free Spire.Doc for .NET. You can install Free Spire.Doc for .NET via NuGet by selecting Tools > NuGet Package Manager > Package Manager Console, and then executing the following command:

PM> Install-Package FreeSpire.Doc

Alternatively, you can also download the DLL files of Free Spire.Doc for .NET from the official website, extract the package and then add the DLL files under the Bin folder to your project as references.

Highlight The First Occurrence of a Specific Text in a Word Document using C# and VB.NET

The following steps demonstrate how to highlight the first occurrence of a specific text in a Word document:

  • Initialize an instance of the Document class.
  • Load a Word document using Document.LoadFromFile() method.
  • Find the first occurrence of a specific text using Document.FindString() method.
  • Get the found text as a single text range using TextSelection.GetAsOneRange() method.
  • Set a highlight color for the text range through TextRange.CharacterFormat.HighlightColor property.
  • Save the result document using Document.SaveToFile() method.

C#

using Spire.Doc;
using Spire.Doc.Documents;
using Spire.Doc.Fields;
using System.Drawing;

namespace HighlightText
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a Document instance
            Document document = new Document();
            //Load a Word document 
            document.LoadFromFile("Input1.docx");

            //Find the first occurrence of a specific text
            TextSelection selection = document.FindString("World Cup", false, true);

            //Get the found text as a single text range
            TextRange textRange = selection.GetAsOneRange();
            //Set a highlight color
            textRange.CharacterFormat.HighlightColor = Color.Yellow;

            //Save the result document
            document.SaveToFile("HighlightText.docx", FileFormat.Docx2013);
        }
    }
}

VB.NET

Imports Spire.Doc
Imports Spire.Doc.Documents
Imports Spire.Doc.Fields
Imports System.Drawing

Namespace HighlightText
    Friend Class Program
        Private Shared Sub Main(ByVal args As String())
            'Create a Document instance
            Dim document As Document = New Document()
            'Load a Word document 
            document.LoadFromFile("Input1.docx")

            'Find the first occurrence of a specific text
            Dim selection As TextSelection = document.FindString("World Cup", False, True)

            'Get the found text as a single text range
            Dim textRange As TextRange = selection.GetAsOneRange()
            'Set a highlight color
            textRange.CharacterFormat.HighlightColor = Color.Yellow

            'Save the result document
            document.SaveToFile("HighlightText.docx", FileFormat.Docx2013)
        End Sub
    End Class
End Namespace

Highlight All Occurrences of a Specific Text in a Word Document using C# and VB.NET

The following steps demonstrate how to highlight all the occurrences of a specific text in a Word document:

  • Initialize an instance of the Document class.
  • Load a Word document using Document.LoadFromFile() method.
  • Find all the occurrences of a specific text using Document.FindAllString() method.
  • Loop through the found text.
  • Get each text as a single text range using TextSelection.GetAsOneRange() method.
  • Set a highlight color for the text range through TextRange.CharacterFormat.HighlightColor property.
  • Save the result document using Document.SaveToFile() method.

C#

using Spire.Doc;
using Spire.Doc.Documents;
using Spire.Doc.Fields;
using System.Drawing;

namespace HighlightAllMatchedText
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a Document instance
            Document document = new Document();
            //Load a Word document 
            document.LoadFromFile("Input1.docx");

            //Find all occurrences of a specific text
            TextSelection[] text = document.FindAllString("World Cup", false, true);

            //Loop through the found text
            foreach (TextSelection selection in text)
            {
                //Get each text as a single text range
                TextRange textRange = selection.GetAsOneRange();
                //Set a highlight color
                textRange.CharacterFormat.HighlightColor = Color.Yellow;
            }

            //Save the result document
            document.SaveToFile("HighlightAllMatchedText.docx", FileFormat.Docx2013);
        }
    }
}

VB.NET

Imports Spire.Doc
Imports Spire.Doc.Documents
Imports Spire.Doc.Fields
Imports System.Drawing

Namespace HighlightAllMatchedText
    Friend Class Program
        Private Shared Sub Main(ByVal args As String())
            'Create a Document instance
            Dim document As Document = New Document()
            'Load a Word document 
            document.LoadFromFile("Input1.docx")

            'Find all occurrences of a specific text
            Dim text As TextSelection() = document.FindAllString("World Cup", False, True)

            'Loop through the found text
            For Each selection As TextSelection In text
                'Get each text as a single text range
                Dim textRange As TextRange = selection.GetAsOneRange()
                'Set a highlight color
                textRange.CharacterFormat.HighlightColor = Color.Yellow
            Next

            'Save the result document
            document.SaveToFile("HighlightAllMatchedText.docx", FileFormat.Docx2013)
        End Sub
    End Class
End Namespace

Get Highlighted Text in a Word Document using C# and VB.NET

The following steps demonstrate how to get all the text highlighted with a specific color in a Word document:

  • Initialize an instance of the Document class.
  • Load a Word document using Document.LoadFromFile() method.
  • Iterate through all sections in the document.
  • Iterate through all paragraphs in each section.
  • Iterate through all child objects in each paragraph.
  • Check if the current child object is of TextRange type.
  • If the result is true, typecast the child object as TextRange.
  • Check if the text range is highlighted with a specific color through TextRange.CharacterFormat.HighlightColor property.
  • If the result is true, get the text through TextRange.Text property.

C#

using Spire.Doc;
using Spire.Doc.Documents;
using Spire.Doc.Fields;
using System;
using System.Drawing;

namespace GetHighlightedText
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a Document instance
            Document document = new Document();
            //Load a Word document
            document.LoadFromFile("Input2.docx");

            //Loop through all sections in the document
            foreach (Section section in document.Sections)
            {
                //Loop through all paragraphs in each section
                foreach (Paragraph paragraph in section.Body.Paragraphs)
                {
                    //Loop through all child objects in each paragraph
                    foreach (DocumentObject obj in paragraph.ChildObjects)
                    {
                        //Check if the current child object is of TextRange type
                        if (obj is TextRange)
                        {
                            TextRange textRange = obj as TextRange;
                            //Check if the text range is highlighted with a specific color
                            if (textRange.CharacterFormat.HighlightColor == Color.Yellow)
                            {
                                //Get the highlighted text
                                string highlightedText = textRange.Text;
                                Console.WriteLine(highlightedText+"\n");
                            }
                        }
                    }
                }
            }

            Console.ReadKey();
        }
    }
}

VB.NET

Imports Spire.Doc
Imports Spire.Doc.Documents
Imports Spire.Doc.Fields
Imports System.Drawing

Namespace GetHighlightedText
    Friend Class Program
        Private Shared Sub Main(ByVal args As String())
            'Create a Document instance
            Dim document As Document = New Document()
            'Load a Word document
            document.LoadFromFile("Input2.docx")

            'Loop through all sections in the document
            For Each section As Section In document.Sections
                'Loop through all paragraphs in each section
                For Each paragraph As Paragraph In section.Body.Paragraphs
                    'Loop through all child objects in each paragraph
                    For Each obj As DocumentObject In paragraph.ChildObjects
                        'Check if the current child object is of TextRange type
                        If TypeOf obj Is TextRange Then
                            Dim textRange As TextRange = TryCast(obj, TextRange)
                            'Check if the text range is highlighted with a specific color
                            If textRange.CharacterFormat.HighlightColor = Color.Yellow Then
                                'Get the highlighted text
                                Dim highlightedText As String = textRange.Text
                                Console.WriteLine(highlightedText & vbLf)
                            End If
                        End If
                    Next
                Next
            Next

            Console.ReadKey()
        End Sub
    End Class
End Namespace

Add, Remove, Replace or Extract Images in PDF in Java

In some cases, you may need to add images to a PDF, for example, when you produce a brochure or other publication that contains vivid images. In some other cases, you may need to remove images from a PDF, for example, if you want to remove useless images from the PDF to reduce its file size. This article will demonstrate how to add, remove, replace or extract images in PDF in Java using Spire.PDF for Java.

Add Dependencies

To manipulate images in PDF, this article uses Spire.PDF for Java. If you are using Maven, you can add the Spire.PDF for Java JAR to your project by adding the following code to your project’s pom.xml file.

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>http://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.pdf</artifactId>
        <version>8.11.0</version>
    </dependency>
</dependencies>

If you are not using Maven, you can download Spire.PDF for Java from this website, extract the package, and then import the Spire.Pdf.jar under the lib folder into your project as a dependency.

Add an Image to a PDF in Java

The following are the main steps to add an image to a PDF document:

  • Initialize an instance of the PdfDocument class.
  • Load a PDF document from file using PdfDocument.loadFromFile() method.
  • Get a specific page by its index using PdfDocument.getPages().get(int) method.
  • Load an image using PdfImage.fromFile() method.
  • Draw the image to a specific location on the page using PdfPageBase.getCanvas().drawImage() method.
  • Save the result document using PdfDocument.saveToFile() method.
import com.spire.pdf.FileFormat;
import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfPageBase;
import com.spire.pdf.graphics.PdfImage;

public class AddImage {
    public static void main(String []args){
        //Create a PdfDocument instance
        PdfDocument pdf = new PdfDocument();
        //Load a PDF document
        pdf.loadFromFile("input.pdf");

        //Get the first page
        PdfPageBase page = pdf.getPages().get(0);

        //Load an image
        PdfImage image = PdfImage.fromFile("image.jpg");

        //Set the width and height of image
        float width = image.getWidth() * 0.50f;
        float height = image.getHeight() * 0.50f;

        //Define a position to draw image
        double x = (page.getCanvas().getClientSize().getWidth() - width) / 2;
        float y = 60f;

        //Draw the image onto a specific position on the page
        page.getCanvas().drawImage(image, x, y, width, height);

        //Save the result document
        pdf.saveToFile("addImage.pdf", FileFormat.PDF);
    }
}
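
The position and size passed to drawImage() above come from simple arithmetic: the image is scaled to half its pixel dimensions and centered horizontally on the canvas. Here is a library-free sketch of that calculation. The canvas width of 595 and the 400 x 300 image size are assumed values chosen for illustration, not values read from a real PDF.

```java
// Illustrative arithmetic only; the canvas and image sizes are assumed values
class ImagePlacementSketch {
    // Scales a pixel dimension by the given factor
    static float scaled(int pixels, float factor) {
        return pixels * factor;
    }

    // Returns the x coordinate that horizontally centers an item on the canvas
    static double centeredX(double canvasWidth, double itemWidth) {
        return (canvasWidth - itemWidth) / 2;
    }

    public static void main(String[] args) {
        double canvasWidth = 595.0;        // assumed canvas width
        float width = scaled(400, 0.50f);  // assumed image width of 400 pixels
        float height = scaled(300, 0.50f); // assumed image height of 300 pixels
        double x = centeredX(canvasWidth, width);
        System.out.println(x + ", " + width + " x " + height); // prints "197.5, 200.0 x 150.0"
    }
}
```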

Remove a Specific Image from a PDF in Java

The following are the steps to remove a specific image from a PDF document:

  • Initialize an instance of the PdfDocument class.
  • Load a PDF document using PdfDocument.loadFromFile() method.
  • Get the desired page in the PDF document by its index using PdfDocument.getPages().get(int) method.
  • Delete a specific image on the page by its index using PdfPageBase.deleteImage(int) method.
  • Save the result document using PdfDocument.saveToFile() method.
import com.spire.pdf.FileFormat;
import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfPageBase;

public class RemoveImage {
    public static void main(String []args){
        //Create a PdfDocument instance
        PdfDocument pdf = new PdfDocument();
        //Load a PDF document
        pdf.loadFromFile("addImage.pdf");

        //Get the first page
        PdfPageBase page = pdf.getPages().get(0);

        //Delete the first image on the page
        page.deleteImage(0);

        //Save the result document
        pdf.saveToFile("removeImage.pdf", FileFormat.PDF);
    }
}

Replace an Image in a PDF in Java

The following are the steps to replace an image with another image in a PDF document:

  • Initialize an instance of the PdfDocument class.
  • Load a PDF document using PdfDocument.loadFromFile() method.
  • Get the desired page in the PDF document by its index using PdfDocument.getPages().get(int) method.
  • Load an image using PdfImage.fromFile() method.
  • Replace a specific image with the loaded image using PdfPageBase.replaceImage(int, PdfImage) method.
  • Save the result document using PdfDocument.saveToFile() method.
import com.spire.pdf.FileFormat;
import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfPageBase;
import com.spire.pdf.graphics.PdfImage;

public class ReplaceImage {
    public static void main(String []args){
        //Create a PdfDocument instance
        PdfDocument doc = new PdfDocument();
        //Load a pdf document
        doc.loadFromFile("addImage.pdf");

        //Get the first page
        PdfPageBase page = doc.getPages().get(0);

        //Load an image
        PdfImage image = PdfImage.fromFile("newImage.jpg");

        //Replace the first image with the loaded image
        page.replaceImage(0, image);

        //Save the result document
        doc.saveToFile("replaceImage.pdf", FileFormat.PDF);
    }
}

Extract All Images from a PDF in Java

The following are the main steps to extract all images from a PDF document:

  • Initialize an instance of the PdfDocument class.
  • Load a PDF document using PdfDocument.loadFromFile() method.
  • Loop through all pages in the document.
  • Extract images from each page using PdfPageBase.extractImages() method.
  • Save the extracted images to a specific file path.
import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfPageBase;

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

public class ExtractImage {
    public static void main(String []args) throws IOException {
        //Create a PdfDocument instance
        PdfDocument doc = new PdfDocument();
        //Load a pdf document
        doc.loadFromFile("addImage.pdf");

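        //Make sure the output folder exists before writing to it
        new File("images").mkdirs();
        //Declare an int variable to number the extracted images
        int index = 0;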
        //Loop through the pages
        for (PdfPageBase page : (Iterable<PdfPageBase>) doc.getPages()) {
            //Extract images from the current page
            for (BufferedImage image : page.extractImages()) {
                //Specify the output file path
                File output = new File("images/" + String.format("image_%d.jpg", index++));
                //Save image as .jpg file
                ImageIO.write(image, "JPG", output);
            }
        }
    }
}
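
The saving step can be tried without a PDF at hand. The sketch below writes a list of in-memory BufferedImage objects to numbered files, creating the output folder first. It uses PNG instead of JPG because images with an alpha channel may not be writable as JPEG on every JDK; that substitution is mine and is not taken from the example above. The class and method names are hypothetical.

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, independent of Spire.PDF
class SaveImagesSketch {
    // Writes each image as image_<n>.png into the folder and returns how many were written
    static int saveAll(Iterable<BufferedImage> images, Path folder) throws IOException {
        // Create the folder (and any missing parents) so the writes cannot fail on a missing path
        Files.createDirectories(folder);
        int index = 0;
        for (BufferedImage image : images) {
            File output = folder.resolve(String.format("image_%d.png", index)).toFile();
            // ImageIO.write returns false when no suitable writer is found
            if (ImageIO.write(image, "png", output)) {
                index++;
            }
        }
        return index;
    }

    public static void main(String[] args) throws IOException {
        // Two blank in-memory images stand in for images extracted from a page
        List<BufferedImage> images = Arrays.asList(
                new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB),
                new BufferedImage(4, 4, BufferedImage.TYPE_INT_ARGB));
        Path folder = Files.createTempDirectory("extracted").resolve("images");
        System.out.println(saveAll(images, folder) + " images written to " + folder);
    }
}
```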

Compare Word Documents in Java

After you send a Word document out for review, you may get a revised version back. If the reviewer doesn’t turn on the “Track Changes” option, you may need to compare the original version and the revised version of the document to determine what changes have been made. In this article, I will demonstrate how to compare Word documents in Java.

The following are the topics covered in this article:

  • Compare Two Word Documents and Highlight Changes in Another Document
  • Ignore Formatting Changes During the Comparison in Java
  • Compare Two Word Documents and Save the Changes to a Text File

Add Dependencies

This article uses Spire.Doc for Java to perform the comparison. If you are using Maven, you can add the Spire.Doc for Java JAR to your project by adding the following code to your project’s pom.xml file.

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>http://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.doc</artifactId>
        <version>10.10.7</version>
    </dependency>
</dependencies>

If you are not using Maven, you can download Spire.Doc for Java from this website, unzip the package, and then import the Spire.Doc.jar under the lib folder into your project as a dependency.

Compare Two Word Documents and Highlight Changes in Another Document in Java

Spire.Doc for Java provides the Document.compare() method for developers to compare two Word documents. This feature will mark up the changes between the two documents as “Tracked Changes”, so you can easily identify the changes and choose to accept or reject the changes later.

The following steps show how to compare two Word documents and show changes as tracked changes in a new document:

  • Create an instance of Document class and load the original Word document using Document.loadFromFile() method.
  • Create an instance of Document class and load the revised document using Document.loadFromFile() method.
  • Call Document.compare() method to compare the two Word documents.
  • Save the result to a new document using Document.saveToFile() method.

Code example:

import com.spire.doc.Document;
import com.spire.doc.FileFormat;

public class CompareWordDocuments {
    public static void main(String[] args){
        //Create a Document instance
        Document doc1 = new Document();
        //Load the original document
        doc1.loadFromFile("Original.docx");

        //Create a Document instance
        Document doc2 = new Document();
        //Load the revised document
        doc2.loadFromFile("Revised.docx");

        //Compare the two Word documents
        doc1.compare(doc2, "Author");

        //Save the result document
        doc1.saveToFile("Compare.docx", FileFormat.Docx_2013);
    }
}

Ignore Formatting Changes During the Comparison in Java

The changes between two Word documents can be content changes as well as formatting changes. If you only want to find out the content changes and dismiss the formatting changes during the comparison, you can use the setIgnoreFormatting() method of the CompareOptions class.

The following steps show how to ignore formatting changes during document comparison:

  • Create an instance of Document class and load the original Word document using Document.loadFromFile() method.
  • Create an instance of Document class and load the revised document using Document.loadFromFile() method.
  • Create an instance of CompareOptions class and call CompareOptions.setIgnoreFormatting(true) method to ignore formatting changes.
  • Compare the two documents while ignoring formatting changes using Document.compare(Document, authorName, CompareOptions) method.
  • Save the result document using Document.saveToFile() method.

Code example:

import com.spire.doc.Document;
import com.spire.doc.FileFormat;
import com.spire.doc.documents.comparison.CompareOptions;

public class IgnoreFormattingChangesDuringComparison {
    public static void main(String[] args){
        //Create a Document instance
        Document doc1 = new Document();
        //Load the first Word document
        doc1.loadFromFile("Original.docx");

        //Create a Document instance
        Document doc2 = new Document();
        //Load the second Word document
        doc2.loadFromFile("Revised.docx");

        //Create a CompareOptions instance
        CompareOptions options = new CompareOptions();
        //Set ignore formatting as true
        options.setIgnoreFormatting(true);
        //Compare the two Word documents with compare option
        doc1.compare(doc2, "Author", options);

        //Save the result document
        doc1.saveToFile("Compare1.docx", FileFormat.Docx_2013);
    }
}

Compare Two Word Documents and Save the Changes to a Text File

You can get the details of the changes such as text and type (insertion/deletion) between two Word documents and save them into a text file by referring to the steps below:

  • Create an instance of Document class and load the original Word document using Document.loadFromFile() method.
  • Create an instance of Document class and load the revised document using Document.loadFromFile() method.
  • Compare two documents using Document.compare() method.
  • Create an instance of DifferRevisions class to get the revisions.
  • Get the insertion revisions into a List using DifferRevisions.getInsertRevisions() method.
  • Get the deletion revisions into a List using DifferRevisions.getDeleteRevisions() method.
  • Create two instances of StringBuilder class for storing the insertion and deletion revisions respectively.
  • Loop through all insertion/deletion revisions in the insertion/deletion revisions list, and save them into the StringBuilder instances.
  • Write the text in the StringBuilder instances into a text file.

Code example:

import com.spire.doc.DifferRevisions;
import com.spire.doc.Document;
import com.spire.doc.fields.TextRange;

import java.io.FileWriter;
import java.io.IOException;
import java.util.List;

public class GetChangesInText {
    public static void main(String[] args) throws IOException {
        //Load one Word document
        Document doc1 = new Document();
        doc1.loadFromFile("Original.docx");

        //Load the other Word document
        Document doc2 = new Document();
        doc2.loadFromFile("Revised.docx");

        //Compare the two Word documents
        doc1.compare(doc2, "Author");

        //Get the revisions
        DifferRevisions differRevisions = new DifferRevisions(doc1);

        //Return the insertion revisions in a list
        List insertRevisionsList = differRevisions.getInsertRevisions();

        //Return the deletion revisions in a list
        List deleteRevisionsList = differRevisions.getDeleteRevisions();

        //Create two int variables
        int m = 0;
        int n = 0;

        StringBuilder insertRevisions = new StringBuilder();
        StringBuilder deleteRevisions = new StringBuilder();
        //Loop through the insertion revision list
        for (int i = 0; i < insertRevisionsList.size(); i++)
        {
            if (insertRevisionsList.get(i) instanceof TextRange)
            {
                m += 1;
                //Get the specific revision and get its content
                TextRange textRange = (TextRange)insertRevisionsList.get(i) ;
                insertRevisions.append("Insertion #" + m + ":" + textRange.getText() + "\n");
            }
        }
        //Loop through the deletion revision list
        for (int i = 0; i < deleteRevisionsList.size() ; i++)
        {
            if (deleteRevisionsList.get(i) instanceof TextRange)
            {
                n += 1;
                //Get the specific revision and get its content
                TextRange textRange = (TextRange) deleteRevisionsList.get(i) ;
                deleteRevisions.append("Deletion #" + n + ":" + textRange.getText() + "\n");
            }
        }

        //Write the insert and delete revisions into a .txt file
        FileWriter writer1 = new FileWriter("Changes.txt");
        writer1.write("Insert revisions:" + "\n");
        writer1.write(insertRevisions.toString());
        writer1.write("==============================" + "\n");
        writer1.write("Delete revisions:" + "\n");
        writer1.write(deleteRevisions.toString());
        writer1.flush();
        writer1.close();
    }
}
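
The report-writing part of the example can be separated from the comparison itself. The sketch below numbers plain strings the same way ("Insertion #1:...") and writes the report through try-with-resources, so the writer is closed even if a write fails. It works on ordinary string lists instead of Spire.Doc revisions, and the class and method names are hypothetical.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, independent of Spire.Doc
class RevisionReportSketch {
    // Builds one numbered line per item, e.g. "Insertion #1:text"
    static String numbered(String label, List<String> items) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < items.size(); i++) {
            sb.append(label).append(" #").append(i + 1).append(":").append(items.get(i)).append("\n");
        }
        return sb.toString();
    }

    // Writes both sections of the report to the given writer
    static void writeReport(Writer out, List<String> insertions, List<String> deletions) throws IOException {
        out.write("Insert revisions:\n");
        out.write(numbered("Insertion", insertions));
        out.write("==============================\n");
        out.write("Delete revisions:\n");
        out.write(numbered("Deletion", deletions));
    }

    public static void main(String[] args) throws IOException {
        // Sample strings stand in for the text of real revisions
        List<String> insertions = Arrays.asList("new sentence", "extra word");
        List<String> deletions = Arrays.asList("old sentence");
        // try-with-resources flushes and closes the file automatically
        try (FileWriter writer = new FileWriter("Changes.txt")) {
            writeReport(writer, insertions, deletions);
        }
    }
}
```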