Product · Capture

Listens. Voice, photo, and document into one transcript.

Capture is the first surface a Fermito firm touches each day. Fermito listens on the phone the engineer already carries and merges voice, photos, and supporting documents into a single dated visit record before anyone gets back to the office.

Field review under OBC 1.2.2.2 is documentation work as much as engineering work. Fermito treats the documentation layer the way the practice deserves: attributable, auditable, and authored on the visit, not in the evening hours that follow.

The Fermito capture form mid-recording. The Opening paragraph textarea shows transcript text from a prior voice note while the mic at the top of the card is in active recording state.
Voice in, transcript on screen. The engineer dictates; Fermito merges every voice note into the same evidence pool the draft is assembled from.

What capture does

Six surfaces, one visit record.

Each input below is a real surface in the product today. None of them are roadmap promises. Fermito firms walk sites with this loop running.

  • 01 · Voice

    Voice notes, structured at the source.

    An engineer walks the slab and talks. Fermito holds the live transcript on-device and streams chunks as the network allows. A dropped LTE signal in a stairwell does not lose the last forty seconds of context. The recording resumes against the same visit record on reconnect.

    Whisper transcription · 5-minute capture window · resume across network drops

  • 02 · Photos

    Photos with AI-assisted captions, ready for review.

    Each photo is timestamped, attributed to the visit, and run through Claude Vision for an initial caption. The engineer keeps walking instead of stopping to type. Captions are draft text; the reviewing engineer rewrites or accepts them on the way to seal.

    Claude Vision captioning · stored on Vercel Blob · embedded in the export

  • 03 · Documents

    Drawings, specs, and prior reports, extracted in place.

    Drop a structural detail PDF or a previous Word report into the visit. Fermito extracts the text server-side and merges the content into the same transcript thread the engineer is dictating against. Prior observation language is available before drafting begins.

    DOCX + PDF · 10 MB upload limit · text merged into the visit thread

  • 04 · One record

    Every input attaches to a single visit, attributed to one engineer.

    Voice, photos, and documents are not loose files. They are children of a single dated visit record, attributed to the engineer who walked the site. The visit is the unit of work. It is what generation reads from, what review references, and what the export header reflects.

    Visit = unit of work · engineer-attributed · auditable from capture to export

  • 05 · Any language

    Speak or type in your first language. The report comes out in English.

    Roughly a third of Ontario's licensed engineers were trained outside Canada. Many are more fluent dictating field observations in Farsi, Mandarin, Hindi, Arabic, Spanish, or Portuguese than composing them in English on-site. Fermito accepts voice and text input in 15+ languages. Translation happens automatically at generation time, not as a workflow step the engineer has to plan around. The engineer captures what they saw in the language that comes naturally. The draft arrives in the firm's signing language.

    15+ input languages · voice + text · Farsi, Mandarin, Hindi, Arabic, Spanish, Portuguese, and more

  • 06 · Form shape follows the doc

    Structured fields match the document being drafted.

    Free-form capture (voice, photos, documents) is the same across every sealed doc shape. The structured form fields adapt: a sealed opinion letter captures recipient, proposed load, and limitations; a Site Review Report (also called a Field Review Report, or FRR) captures observations, photos, and spatial anchors against the visit. The engineer sees the form that fits the work in front of them, not a lowest-common-denominator schema.

    Per-doc-shape structured fields · shared free-form capture · no re-learning per doc type
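
For readers who think in data structures, the loop above can be pictured as one visit record with typed children. The following is a minimal sketch under stated assumptions, not Fermito's actual schema: every type, field, and function name here (Visit, Evidence, StructuredFields, attach, mergedThread) is hypothetical, and reading the 10 MB limit as binary megabytes is a guess.

```typescript
// Hypothetical model of "one visit, many inputs". Names are illustrative only.

// Structured fields vary with the document being drafted.
type StructuredFields =
  | { shape: "opinion-letter"; recipient: string; proposedLoad: string; limitations: string }
  | { shape: "site-review-report"; observations: string[]; photoIds: string[]; spatialAnchors: string[] };

// Free-form capture is the same for every doc shape.
type Evidence =
  | { kind: "voice"; capturedAt: Date; transcript: string; language: string }
  | { kind: "photo"; capturedAt: Date; draftCaption: string; captionAccepted: boolean }
  | { kind: "document"; capturedAt: Date; fileName: string; sizeBytes: number; extractedText: string };

interface Visit {
  id: string;
  visitDate: string; // ISO date of the visit
  engineer: string;  // the one engineer the visit is attributed to
  fields: StructuredFields;
  evidence: Evidence[];
}

// The page states a 10 MB upload limit; 10 * 1024 * 1024 bytes is an assumption.
const MAX_UPLOAD_BYTES = 10 * 1024 * 1024;

// Attach one input to the visit, returning a new record and leaving the old one untouched.
function attach(visit: Visit, item: Evidence): Visit {
  if (item.kind === "document" && item.sizeBytes > MAX_UPLOAD_BYTES) {
    throw new Error(`${item.fileName} exceeds the 10 MB upload limit`);
  }
  return { ...visit, evidence: [...visit.evidence, item] };
}

// Merge every text-bearing input into one thread, ordered by capture time.
function mergedThread(visit: Visit): string[] {
  return [...visit.evidence]
    .sort((a, b) => a.capturedAt.getTime() - b.capturedAt.getTime())
    .map((e): string => {
      switch (e.kind) {
        case "voice":
          return `[voice] ${e.transcript}`;
        case "photo":
          return `[photo] ${e.draftCaption}`;
        case "document":
          return `[document ${e.fileName}] ${e.extractedText}`;
      }
    });
}
```

The discriminated unions mirror the split described above: the evidence types stay fixed across doc shapes, while the structured fields switch on the shape of the sealed document.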

What gets fused

Voice, photo, document, and first language - one transcript.

Fermito treats every artifact from a site visit as evidence and merges them into a single thread before drafting begins. The engineer narrates, photographs, and drops in the prior PDF. Fermito reads them as one piece of context, not four input fields stitched together.

  • Voice transcript

    Live, on-device, network-resume aware. The engineer’s narration is the spine of the visit record.

  • Photo with vision caption

    Each photo is captioned, timestamped, attributed to the visit, and held against the same transcript thread the engineer is dictating.

  • Extracted documents

    Drawings, prior reports, and structural details are read server-side and merged into the same thread. Prior observation language is available before drafting begins.

  • Multi-language input

    Voice and text in 15+ first languages, fused into the same transcript Fermito drafts from. The engineer captures in the language that comes naturally; the draft arrives in the firm’s signing language.
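
The "network-resume aware" behaviour of the voice transcript can be illustrated with a small buffer that keeps chunks on the device until the server acknowledges them. This is a sketch of the general technique, not Fermito's implementation: ChunkBuffer, Chunk, and Send are hypothetical names, and the send callback is synchronous only to keep the example short. A real transport would be asynchronous.

```typescript
// Hypothetical on-device buffer: record locally first, send when the network allows.

interface Chunk {
  visitId: string; // every chunk stays tied to the same visit record
  seq: number;     // sequence number so the server can keep chunks in order
  text: string;
}

// Returns true when the server acknowledged the chunk, false when the network is down.
type Send = (chunk: Chunk) => boolean;

class ChunkBuffer {
  private pending: Chunk[] = [];
  private nextSeq = 0;
  private readonly visitId: string;
  private readonly send: Send;

  constructor(visitId: string, send: Send) {
    this.visitId = visitId;
    this.send = send;
  }

  // Buffer the chunk before any network call, so a dropped signal loses nothing.
  record(text: string): void {
    this.pending.push({ visitId: this.visitId, seq: this.nextSeq, text });
    this.nextSeq += 1;
  }

  // Send buffered chunks in order; stop at the first failure and keep the rest.
  flush(): number {
    let sent = 0;
    while (this.pending.length > 0) {
      const head = this.pending[0];
      if (head === undefined || !this.send(head)) break;
      this.pending.shift();
      sent += 1;
    }
    return sent;
  }

  // Number of chunks still waiting for a connection.
  get unsent(): number {
    return this.pending.length;
  }
}
```

Calling flush() again after a reconnect picks up at the oldest unsent chunk, which is what lets a recording resume against the same visit record.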

See how Fermito listens

Field documentation, by the rule

Capture is built around what PEO actually expects from a site-visit record, whether the visit is a field review, a condition assessment, or an envelope investigation.

The Ontario Building Code requires that field reviews under OBC 1.2.2.2 be performed by, or under the direct supervision of, the practitioner responsible for the design. Fermito captures the record in the practitioner’s voice, on the visit, attributed to the engineer present.

PEO Practice Bulletin - Field Reviews expects contemporaneous documentation: dated, attributed, and tied to the visit it describes. Capture is contemporaneous by construction. There is no “transcribe the notebook on Friday” step to fail.

Standards referenced in observations - CSA O86, CSA A23.3, CSA S16 - stay searchable in the visit record, not flattened into prose the engineer has to re-find later.

Next surface

Capture lands. Generate begins.

Once the visit record closes, Fermito reads it end to end and assembles a draft in the firm's house format - templates, callouts, and phrasing pulled from the firm's own library.