This feature is Experimental and may change based on user feedback and testing. Share your thoughts via our chatbot to help us improve it.
The Get Document Information block in Leapwork enables users to extract key metadata from documents, such as author, page count, and creation or modification dates. This functionality is essential for automating validation processes and ensuring that documents meet specific criteria.
Example: A PDF file containing a contract can be analyzed to extract the author, page count, and creation date for validation in an approval workflow.
Note: This feature is available starting from Release 2025.1.XXX.
When fully expanded, the Get Document Information block displays the following properties:
The green input connector in the header is used to trigger the block to start executing.
The green output connector in the header triggers when the document information has been successfully retrieved.
The title of the block “Get Document Information” can be changed by double-clicking on it and typing in a new title.
The Get Document Information block supports the following file types:
Users must select a supported file type when importing a document.
Once you drag a file into the block, the block automatically recognizes its file type. You can choose any of the following options as the source type:
This field lets you upload a file. Selecting "Import New File" opens a window where you can upload the document.
This parameter returns the author of the document, if available. The author is typically stored in the document metadata and can be used to validate document ownership or source.
Example: A PDF file created by "John Doe" will return:
Author: John Doe
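To illustrate what "stored in the document metadata" means in practice, the sketch below reads the core properties of a Word (.docx) file using only the Python standard library. It is not how Leapwork retrieves the values; it only shows where a .docx package keeps its author, title, and dates (the `docProps/core.xml` part). The function name `read_docx_core_properties` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by docProps/core.xml in Office Open XML packages.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def read_docx_core_properties(source):
    """Return author, title, created and modified from a .docx file.

    `source` is a file path or a binary file-like object.
    Fields that are absent from the metadata are returned as None.
    (Hypothetical helper, for illustration only.)
    """
    # A .docx file is a ZIP archive; core metadata lives in one XML part.
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))

    def text_of(tag):
        element = root.find(tag, NS)
        return element.text if element is not None else None

    return {
        "author": text_of("dc:creator"),
        "title": text_of("dc:title"),
        "created": text_of("dcterms:created"),
        "modified": text_of("dcterms:modified"),
    }
```

If the archive has no `docProps/core.xml` part, `package.read` raises `KeyError`, which mirrors the "if available" caveat: metadata fields are optional and a document may simply not carry them.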
The page count indicates the total number of pages in the document. This is useful for validation when a document must meet specific length requirements.
Example: A Word document with 5 pages will return:
Page Count: 5
This parameter retrieves the date and time when the document was originally created. It can be used to verify document versioning or compliance with time-sensitive requirements.
Example: A document created on January 15, 2024 will return:
Date Created: 2024-01-15T10:30:00Z
The last modified date reflects the most recent time the document was edited. This is useful for tracking document changes over time.
Example: If a document was last modified on February 10, 2024, it will return:
Date Modified: 2024-02-10T14:45:00Z
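The date values in the examples above follow the ISO 8601 pattern with a trailing `Z` for UTC. Assuming the block always returns dates in that form (the examples suggest it, but the source does not state it), a downstream script could compare them as sketched here. The helper names are hypothetical.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse a timestamp such as '2024-01-15T10:30:00Z' into an aware datetime."""
    # datetime.fromisoformat() accepts a trailing 'Z' only from Python 3.11,
    # so replace it with an explicit UTC offset for older versions.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def modified_after_creation(date_created, date_modified):
    """Return True if the document was edited after it was created."""
    return parse_timestamp(date_modified) > parse_timestamp(date_created)


def days_between(earlier, later):
    """Return the number of whole days between two timestamps."""
    return (parse_timestamp(later) - parse_timestamp(earlier)).days
```

Parsing into timezone-aware `datetime` objects, rather than comparing the strings, keeps the comparison correct even if one value carries a different UTC offset.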
If a document has a title set in its metadata, this parameter will return it. The title is often used in structured documents, such as reports or legal files.
Example: A PDF document with the title Annual Report 2024 will return:
Document Title: Annual Report 2024
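The returned values are typically checked against rules in an approval workflow. The following is a minimal sketch of such a check, written outside Leapwork for illustration. The `DocumentInfo` class, the `validate_document` function, and the specific rules (allowed authors, maximum page count, required title) are all assumptions, not part of the product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DocumentInfo:
    """Hypothetical container for values returned by the block."""
    author: Optional[str]
    page_count: int
    title: Optional[str]


def validate_document(info, allowed_authors, max_pages):
    """Return a list of problems; an empty list means the document passes."""
    problems = []
    if info.author is None:
        problems.append("author is missing")
    elif info.author not in allowed_authors:
        problems.append(f"unexpected author: {info.author}")
    if info.page_count > max_pages:
        problems.append(f"too many pages: {info.page_count} > {max_pages}")
    if not info.title:
        problems.append("title is missing")
    return problems
```

Returning a list of problems instead of a single boolean makes it easier to report every failed rule at once when a document is rejected.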
This connector is triggered if the Get Document Information block fails to retrieve metadata from the document. A failure can occur due to various reasons, such as:
When triggered, this connector can be used to handle failure scenarios within the automation flow.
If the Default Timeout checkbox is not selected, the timeout is 10 seconds. If it is selected, the Default Timeout value set in the flow settings applies.
The timeout is the maximum time spent retrieving the document information before the block gives up and triggers its failure connector.
Note: All cases have a global timeout that can be configured in the Settings panel. This is unrelated to the timeout of a single building block. However, a running case will automatically be cancelled if it runs for longer than the global timeout.
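The timeout rules above can be summarized as a small decision function. This is only a sketch of the logic as described in this section; the function and constant names are hypothetical, and the real behavior is determined by Leapwork itself.

```python
# Value used when the Default Timeout checkbox is not selected,
# as stated in this section.
BLOCK_TIMEOUT_SECONDS = 10.0


def effective_block_timeout(use_default_timeout, flow_default_seconds):
    """Return the timeout, in seconds, that applies to a single block."""
    if use_default_timeout:
        # Checkbox selected: the value from the flow settings applies.
        return flow_default_seconds
    return BLOCK_TIMEOUT_SECONDS


def case_is_cancelled(elapsed_seconds, global_timeout_seconds):
    """Return True once a running case has exceeded the global timeout.

    The global timeout is independent of any single block's timeout.
    """
    return elapsed_seconds > global_timeout_seconds
```

For example, with the checkbox selected and a flow default of 30 seconds, the block waits up to 30 seconds, while a case that has been running longer than the global timeout is cancelled regardless of the block's own setting.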
Created 2025.02.12
©2024, Leapwork. All rights reserved.