
Google OCR

AutomatR.Windows.Activities.GoogleOCR

The "Google OCR" activity in AutomatR uses the Google Cloud Vision API to perform optical character recognition (OCR). It extracts text from images or documents so that the text can be used in automation workflows.

Properties

| Name | Description |
| --- | --- |
| Input | |
| API Key | The API key associated with your Google Cloud Platform (GCP) project, used to authenticate requests to the Google Cloud Vision API. Accepts a String variable containing the API key. |
| File Name | The name of the document or image file on which OCR is performed. The file can be stored locally or in a cloud storage service. Accepts a String variable containing the file name. |
| File Path | The local path to the document or image file. Used when the file is stored locally rather than in a cloud storage service. Accepts a String variable containing the local file path. |
| Region Selection | Selects the image region to capture: click the ellipsis button (...) and drag the mouse to define the region of interest. Useful for restricting OCR to a specific area of an image. Variables are not supported because the selection requires user interaction. |
| Misc | |
| Display Name | A custom name for the activity as shown in the workflow, which helps keep the automation project organized. Accepts a String variable containing the desired display name. |
| Optional | |
| Delay | The time, in seconds, to wait before executing the Google OCR activity. Useful for handling synchronization issues. Accepts an Integer variable containing the delay duration. Example: enter 1 to wait 1 second (1000 milliseconds). |
| Output | |
| Result | The result of the Google OCR operation, typically the extracted text along with additional information about the document. Store it in a variable of a suitable type (e.g., a String variable). |

How to use:

  1. Drag and drop the "Google OCR" activity onto the workflow.
  2. Input the API key and file information in the properties pane.
  3. Specify the source of the document (local file or cloud storage).
  4. Use the region selection feature to define the area of interest within the image.
  5. Optionally, configure the delay and customize the display name.
  6. Execute the workflow to perform OCR using the Google Cloud Vision API.

Note: Ensure that the Google Cloud Vision API is enabled for your GCP project and that the API key has the necessary permissions for successful OCR operations.
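One way to confirm that the API is enabled and the key is accepted, independently of AutomatR, is to call the public Vision REST endpoint directly. The sketch below is only an illustration: how the activity talks to the API internally is not described here, and the function names `build_request_body` and `annotate_image` are hypothetical helpers, not part of AutomatR.

```python
import base64
import json
import urllib.request

# Public REST endpoint of the Cloud Vision API for image annotation.
VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_request_body(image_bytes: bytes) -> dict:
    """Build a TEXT_DETECTION request for a single image (hypothetical helper)."""
    return {
        "requests": [
            {
                # The REST API expects the image bytes base64-encoded.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }


def annotate_image(api_key: str, file_path: str) -> dict:
    """Send a local image to the Vision API and return the decoded JSON response."""
    with open(file_path, "rb") as f:
        body = build_request_body(f.read())
    request = urllib.request.Request(
        f"{VISION_URL}?key={api_key}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError when the request is rejected,
    # for example because the key is invalid or the API is not enabled.
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `annotate_image("your_api_key", r"C:\Images\sample.png")` with a real key should return a JSON document; an `HTTPError` at this point suggests the key or project configuration needs attention before the activity is used.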

Example: The "Google OCR" activity is used to extract text from a local image file:

Google OCR:
Display Name: "Extract Text from Image"
API Key: "your_api_key"
File Path: "C:\Images\sample.png"
Region Selection: [User Interaction]
Result: extractedText

In this example, the activity uses the Google Cloud Vision API to extract text from the "sample.png" image file. The region of interest is interactively defined by the user through the region selection feature. The extracted text is stored in the variable "extractedText" for further use in the workflow.
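For orientation, the Vision API itself returns the recognized text in the `fullTextAnnotation.text` field of each response entry. Whether the activity's Result holds that raw response or only the text is not stated here, so the following is a sketch under that assumption; `extract_text` and the sample response are hypothetical and hand-written for illustration.

```python
def extract_text(response: dict) -> str:
    """Return the full recognized text from a Vision API annotate response."""
    results = response.get("responses", [])
    if not results:
        return ""
    first = results[0]
    # Per-image failures are reported inside the response body.
    if "error" in first:
        raise RuntimeError(first["error"].get("message", "Vision API error"))
    return first.get("fullTextAnnotation", {}).get("text", "")


# Hypothetical response, trimmed to the single field used above.
sample_response = {
    "responses": [
        {"fullTextAnnotation": {"text": "Invoice 1042\nTotal: 99.00\n"}}
    ]
}

extracted_text = extract_text(sample_response)
# Split into non-empty lines for further use in a workflow.
lines = [line for line in extracted_text.splitlines() if line.strip()]
```

Splitting the text into lines is only one possible next step; the same string could equally be searched with a pattern or written to a file.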