Reading Word File Content from Office 365

I had a case where I needed to read the content of a document from Office 365 through the REST API and use it in my project. More specifically, I wanted to use the content in my Word add-in. Back then, I struggled to read the content correctly, and I didn’t find a good solution for handling the data that comes back from the REST API.

Use it in an Application

Finally, I did solve the issue, and I will show you how. This is just a quick example of the functionality without a complete application. I’m giving a session this upcoming Saturday, 10/28/2017, at SharePoint Saturday New England at 9:00 am.

My session title is “Tools for Information Worker – Introduction to Office Add-ins Development.” In the session, I will demonstrate a complete example of how to use these techniques, and I will also share the source code of the application after the session.

http://spsnewengland.org/agenda/

Stay tuned and follow me on Twitter @mikkokoskinen to know when the application is available.

Reading in Node.js Application

But back to the solution. You can use the REST API endpoint getfilebyserverrelativeurl with the $value option to get the document together with its content.

executor.executeAsync({
  // A string literal cannot span several lines, so the URL is concatenated.
  url: "<app web url>/_api/SP.AppContextSite(@target)/web" +
    "/getfilebyserverrelativeurl('/Shared Documents/filename.docx')/$value" +
    "?@target='<host web url>'",
  method: "GET",
  binaryStringResponseBody: true,
  success: successHandler,
  error: errorHandler
});

More info: https://msdn.microsoft.com/en-us/library/office/dn450841.aspx
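
With binaryStringResponseBody set to true, the file comes back as a binary string, where each character holds one byte. Here is a minimal sketch of turning such a string into bytes, assuming the success handler receives it in data.body. The helper name binaryStringToBytes is made up for this example and is not part of the SharePoint library.

```typescript
// Hypothetical helper: converts a binary string (one character per byte)
// into a Uint8Array.
function binaryStringToBytes(binary: string): Uint8Array {
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    // Keep only the low 8 bits of each character code.
    bytes[i] = binary.charCodeAt(i) & 0xff;
  }
  return bytes;
}

// A .docx file is a zip archive, so it starts with the bytes of "PK\u0003\u0004".
const header = binaryStringToBytes("PK\u0003\u0004");
console.log(header.join(", ")); // prints "80, 75, 3, 4"
```

Inside a success handler, the call could look like binaryStringToBytes(data.body).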

If you replace the placeholders in the REST call above and navigate to the URL in your browser, the document is downloaded automatically. If you have ever used Microsoft Graph to get documents from a document library, you may have seen a property called @microsoft.graph.downloadUrl. For example, this call lists the documents in the library with a given id: https://graph.microsoft.com/v1.0/drives/{library-id}/root/children.

In case you didn’t know, @microsoft.graph.downloadUrl gives you short-term access to the file without the need to send authentication in a request header. The URL contains a temporary authentication token that is valid only for a couple of minutes. The same thing applies to that URL: if you navigate to it, the document is downloaded.
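
The children call returns its items in a value array, and each file item carries the @microsoft.graph.downloadUrl property. Here is a small sketch of picking the URL for one file out of such a response. The helper findDownloadUrl, the two interfaces, and the sample data are made up for illustration and cover only the fields the sketch needs.

```typescript
// Only the fields this sketch needs from a drive item.
interface DriveItem {
  name: string;
  "@microsoft.graph.downloadUrl"?: string; // not present on folders
}

interface ChildrenResponse {
  value: DriveItem[];
}

// Hypothetical helper: returns the download URL of the named file, if any.
function findDownloadUrl(response: ChildrenResponse, fileName: string): string | undefined {
  for (const item of response.value) {
    if (item.name === fileName) {
      return item["@microsoft.graph.downloadUrl"];
    }
  }
  return undefined;
}

// Made-up sample data in the shape of the children response.
const sample: ChildrenResponse = {
  value: [
    { name: "Templates" },
    {
      name: "filename.docx",
      "@microsoft.graph.downloadUrl": "https://example.invalid/download?tempauth=abc"
    }
  ]
};

console.log(findDownloadUrl(sample, "filename.docx"));
```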

So how do you get the content into a variable and use it? Here’s a short explanation of how you can use the return value of the call in an application that uses Node.js and TypeScript. Maybe your application is a Word add-in, and you want to read a document in Office 365 as a starting point for your own document.

In the example, you have the @microsoft.graph.downloadUrl of the file, and you want to download the content into a variable.

To make the call from a URL, we will use a module called node-fetch.
1. It’s a lightweight module that brings window.fetch to Node.js.
2. More information: https://www.npmjs.com/package/node-fetch

To set it up:
1. Run npm install --save node-fetch in the terminal window to install the module for the project.
2. Open the TypeScript file where you want to add a function for the call.
3. Add a new reference for node-fetch: import fetch = require('node-fetch');

Then add a function that uses the @microsoft.graph.downloadUrl to get the content of the document.

1. The URL is sent as a parameter in the function call.
2. This function is resolving a promise so that we can use await functionality when we are calling the function.

static async getTemplateDocument(templateURL: string): Promise<Buffer> {
        // A plain GET is enough: the download URL already carries its own token.
        const res = await fetch(templateURL);
        if (!res.ok) {
            throw new Error('Download failed with HTTP status ' + res.status);
        }
        // buffer() reads the whole response body as binary data.
        return res.buffer();
    }

The important part is to read the response with buffer() instead of text() or json(). A docx file is binary, and decoding it as text would corrupt the content. No special option is needed in the fetch call itself: the body option of fetch is the request payload, not a response format, so it is left out here.
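
Because the function returns a promise, the caller can use await and handle a failed download with try/catch. The sketch below shows that calling pattern. To keep it runnable without a network, it takes the fetch function as a parameter and uses a stand-in for it. The names downloadBytes, Fetcher, and fakeFetcher are made up for this example, and it reads the body with the standard arrayBuffer() call.

```typescript
// Minimal shape of what this sketch needs from a fetch response.
interface BinaryResponse {
  ok: boolean;
  status: number;
  arrayBuffer(): Promise<ArrayBuffer>;
}
type Fetcher = (url: string) => Promise<BinaryResponse>;

// Hypothetical helper: downloads a file and returns its bytes.
async function downloadBytes(url: string, fetcher: Fetcher): Promise<Uint8Array> {
  const res = await fetcher(url);
  if (!res.ok) {
    throw new Error("Download failed with HTTP status " + res.status);
  }
  return new Uint8Array(await res.arrayBuffer());
}

// Stand-in fetcher that always answers with four fixed bytes.
const fakeFetcher: Fetcher = async () => ({
  ok: true,
  status: 200,
  arrayBuffer: async () => {
    const buf = new ArrayBuffer(4);
    new Uint8Array(buf).set([80, 75, 3, 4]);
    return buf;
  }
});

async function main(): Promise<void> {
  try {
    const bytes = await downloadBytes("https://example.invalid/template.docx", fakeFetcher);
    console.log("Downloaded " + bytes.length + " bytes"); // prints "Downloaded 4 bytes"
  } catch (err) {
    console.log("Could not load the template: " + err);
  }
}
main();
```

In a real application, the second argument would be a function that wraps the actual fetch call.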

And the Data is?

We are almost there. The question is: what does the getTemplateDocument call actually send back to us?

The answer is that we get back a Node.js Buffer, which is a Uint8Array that holds the content of the template Word document. We can now use this array in whatever way our application needs. In Office.js there is a function called insertFileFromBase64. With this function, we can add the content of a docx file to our current document, as long as the file is base64 encoded. And because we already have the file as a Uint8Array, it’s easy to make the conversion and insert the file.

Here’s a short code example for that, assuming the file came back in the data attribute of a result object from the function call above.

var u8 = result.data;
    // Turn the bytes into a binary string and base64 encode it.
    // Note: apply() passes every byte as a separate argument, so a very
    // large file can exceed the engine's argument limit.
    var b64encoded = btoa(String.fromCharCode.apply(null, u8));

    Word.run(function (context) {

        // Create a proxy object for the document body.
        var body = context.document.body;

        // Queue a command to replace the body with the content of the file.
        body.insertFileFromBase64(b64encoded, Word.InsertLocation.replace);

        // Synchronize the document state by executing the queued commands,
        // and return a promise to indicate task completion.
        return context.sync().then(function () {
            console.log('Added the content of the file.');
        });
    })
    .catch(function (error) {
        console.log('Error: ' + JSON.stringify(error));
        if (error instanceof OfficeExtension.Error) {
            console.log('Debug info: ' + JSON.stringify(error.debugInfo));
        }
    });
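
Passing a whole file to String.fromCharCode in one apply() call can fail on bigger documents, because every byte becomes a function argument. Here is one way around that, as a sketch: convert the bytes in chunks and join the pieces before encoding. The helper name bytesToBase64 and the chunk size are my own choices for illustration, and the sketch assumes btoa is available, as it is in browsers and recent Node.js versions.

```typescript
// Hypothetical helper: base64 encodes a byte array in chunks, so that
// String.fromCharCode never receives more than chunkSize arguments.
function bytesToBase64(bytes: Uint8Array, chunkSize: number = 0x8000): string {
  let binary = "";
  for (let offset = 0; offset < bytes.length; offset += chunkSize) {
    // subarray() creates a view on the same memory without copying it.
    const chunk = bytes.subarray(offset, offset + chunkSize);
    binary += String.fromCharCode.apply(null, Array.prototype.slice.call(chunk));
  }
  return btoa(binary);
}

// The first four bytes of a docx (zip) file.
console.log(bytesToBase64(new Uint8Array([80, 75, 3, 4]))); // prints "UEsDBA=="
```

The returned string can be passed to insertFileFromBase64 in the same way as b64encoded.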

One thought on “Reading Word File Content from Office 365”

Dheeraj Balmuri says:

05/06/2021 at 1:11 am

Hi, I am doing a similar kind of POC. Is there a chance that i can find this work on github?
