
Bulk Editing Posts at WordPress.com with the REST API

A little while ago I migrated my personal blog over to WordPress.com, and didn’t notice for quite some time that there were issues in the body text of some of the older posts (the blog has several thousand posts). If the blog had been hosted on my own server, I could have just written a script to update the content directly in the database, but it is hosted at WordPress.com, so that wasn’t an option.

I had a play with the WordPress REST API, and am happy to report that it allowed me to not only load all of the posts from my blog via a script, but also update them.

The script below is purely a guide – it will not work “out of the box”, as you will see if you read the various notes. It’s a template you can fashion to do what you want by adding the various pieces together. In my “real” version, all of the snippets are in one script, one after another.

Oh, and finally, it’s worth noting that this is PHP, and I ran it at the command line in a virtual machine running Ubuntu Server 16.x, spun up at DigitalOcean and destroyed afterwards. It cost pennies for the time it existed. The only things I had to install on the VM were PHP 7 and the PHP cURL extension. There would be nothing to stop you converting it into a PHP script running in a browser, except you would probably hit time-outs. The nice thing about running it at the command line is you get to see progress as it runs.

Get an Access Token

Although some methods of the WordPress API (such as retrieving sites and posts) require no authentication, we will be calling update later, so we will need an access token. To get one you have to configure an application at developer.wordpress.com/apps, which will give you a Client ID and a Client Secret string (the snippet below should be self-explanatory).

$client_id = '...';
$client_secret = '...';
$site_url = 'your_blog_name.wordpress.com';
$username = '...';
$password = '...';

// get an access token
$curl = curl_init( 'https://public-api.wordpress.com/oauth2/token' );
curl_setopt( $curl, CURLOPT_POST, true );
curl_setopt( $curl, CURLOPT_POSTFIELDS, array(
    'client_id' => $client_id,
    'client_secret' => $client_secret,
    'grant_type' => 'password',
    'username' => $username,
    'password' => $password,
) );
curl_setopt( $curl, CURLOPT_RETURNTRANSFER, 1);
$auth = curl_exec( $curl );
$auth = json_decode($auth);
$access_token = $auth->access_token;

print "Access Token [".$access_token."]\r\n\r\n";
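
The snippet above assumes the token request always succeeds. A minimal sketch of a stricter check is shown here; the function name extract_access_token is hypothetical, and the "error" and "error_description" field names are an assumption based on the usual OAuth2 error format rather than something confirmed in this post.

```php
<?php
// Hypothetical helper: pull the access token out of the raw JSON body
// returned by the token endpoint, or throw if it is not there.
function extract_access_token(string $raw_response): string
{
    $decoded = json_decode($raw_response);

    // json_decode returns null when the body is not valid JSON
    if (!is_object($decoded)) {
        throw new RuntimeException('Token response was not a JSON object');
    }

    // Assumption: failures carry OAuth2-style "error" / "error_description" fields
    if (isset($decoded->error)) {
        $detail = $decoded->error_description ?? 'no description';
        throw new RuntimeException('Token request failed: '.$decoded->error.' ('.$detail.')');
    }

    if (empty($decoded->access_token)) {
        throw new RuntimeException('Token response did not contain an access_token');
    }

    return $decoded->access_token;
}
```

It could be called as extract_access_token((string) curl_exec($curl)) in place of the json_decode lines, so that a bad password or client secret stops the script immediately instead of failing later on the first update.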

Get Site Information

The REST API call to retrieve posts needs the internal WordPress ID of your site – to get this you need to call the Sites API.

// get site info
$site_options = array (
    'http' =>
    array (
    'ignore_errors' => true,
    ),
);
$site_context = stream_context_create( $site_options );
$site_response = file_get_contents(
    'https://public-api.wordpress.com/rest/v1.2/sites/'.$site_url.'/',
    false,
    $site_context
);
$site_response = json_decode( $site_response );
$site_id = $site_response->ID;
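
Because the stream context sets ignore_errors to true, file_get_contents() returns the response body even when the server answers with a 4xx or 5xx status, so the script cannot tell a real site record from an error document just by looking at the return value. PHP fills a local variable called $http_response_header with the raw header lines after each call through the HTTP wrapper, and the status can be read from there. The helper name last_http_status below is hypothetical; this is a sketch, not part of the original script.

```php
<?php
// Hypothetical helper: find the final HTTP status code in the header lines
// that the HTTP stream wrapper exposes via $http_response_header.
function last_http_status(array $header_lines): int
{
    $status = 0;
    foreach ($header_lines as $line) {
        // Status lines look like "HTTP/1.1 200 OK"; redirects add one per hop
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $matches)) {
            $status = (int) $matches[1];
        }
    }
    return $status; // 0 means no status line was found
}
```

Straight after the file_get_contents() call, something like if (last_http_status($http_response_header) !== 200) { exit("Site lookup failed\r\n"); } would stop the script before it tries to read an ID that is not there.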

Retrieve the Posts and Update Them

To get hold of the posts from the blog, we need to repeatedly call the posts API, with a number of parameters – essentially the number of posts to grab in each iteration, and the number of pages to try and loop through. There are a number of ways of iterating the pages – I have gone with a very hacky way that suited my needs – you could be far more clever, and use the page_handle data that comes back with the response data.
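
For reference, a rough sketch of the page_handle approach might look like the following. It assumes that each response carries the handle for the next page in meta->next_page and that the handle is passed back as a page_handle query parameter; check the posts endpoint documentation before relying on either. The names each_post and $fetch_page are hypothetical: $fetch_page stands for any function that takes an array of query parameters, makes the request, and returns the decoded JSON response.

```php
<?php
// Sketch: walk every page of posts by following meta->next_page.
// $fetch_page is a hypothetical callable that takes a query-parameter array
// and returns the decoded JSON response object for that request.
function each_post(callable $fetch_page, int $posts_per_page): Generator
{
    $params = array('number' => $posts_per_page);

    while (true) {
        $response = $fetch_page($params);

        if (empty($response->posts)) {
            return; // nothing (more) to process
        }

        foreach ($response->posts as $post) {
            yield $post;
        }

        // Assumption: the API signals "more results" with meta->next_page
        if (empty($response->meta->next_page)) {
            return;
        }
        $params['page_handle'] = $response->meta->next_page;
    }
}
```

In this script, $fetch_page could wrap the existing file_get_contents() and json_decode() calls, building the query string with http_build_query($params). The loop then ends on its own when the API runs out of posts, rather than relying on a guessed page count. The script below sticks with the simpler page counter.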

// configuration parameters
$posts_per_page = 20;
$pages = 200;
$search_pattern = "..."; // the pattern to identify content within a post that needs updating
$replace_search_pattern = "..."; // the replacement search pattern (regex)
$replace_pattern = "..."; // the replacement pattern (regex)

// setup the post context
$posts_options = array ( 'http' => array ('ignore_errors' => true, ),);
$posts_context = stream_context_create( $posts_options );

// loop through the pages
for ($page=1; $page<=$pages; $page++)
{
    $posts_url = 'https://public-api.wordpress.com/rest/v1.1/sites/'.$site_url.'/posts/?page='.$page.'&number='.$posts_per_page .'&fields=ID,title,content';
    $posts_response = file_get_contents( $posts_url, false, $posts_context);
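    $posts_response = json_decode( $posts_response );

    // stop as soon as a page comes back with no posts
    if ( empty( $posts_response->posts ) ) {
        break;
    }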
    for ($i=0; $i<count($posts_response->posts); $i++) {
        $post = $posts_response->posts[$i];
        print " - ".$post->ID." ".$post->title;

        // does the post have a pattern match in it ?
        $match_result = preg_match($search_pattern,$post->content);
        if ($match_result > 0) {
            print " MATCH FOUND";
            $post_id = $post->ID;
            $updated_content = preg_replace($replace_search_pattern, $replace_pattern, $post->content);

            print "\r\n\r\n".$updated_content."\r\n\r\n";

            // do the update
            $update_options = array (
                'http' => array (
                    'ignore_errors' => true,
                    'method' => 'POST',
                    'header' => array (
                        0 => 'authorization: Bearer '.$access_token,
                        1 => 'Content-Type: application/x-www-form-urlencoded',
                    ),
                    'content' => http_build_query( array (
                        'content' => $updated_content,
                    )),
                ),
            );

            $update_context = stream_context_create( $update_options );
            $update_response = file_get_contents('https://public-api.wordpress.com/rest/v1.2/sites/'.$site_id.'/posts/'.$post_id,false,$update_context);
            $update_response = json_decode( $update_response );

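            // only report success if the API did not send back an error
            if (is_object($update_response) && !isset($update_response->error)) {
                print " UPDATED";
            } else {
                print " FAILED";
            }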
        }

        print "\r\n";
    }
}
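
Before pointing the loop at several thousand live posts, it may be worth trying the patterns on a sample string first. The sketch below wraps the match-then-replace step from the loop in a hypothetical function, rewrite_content, and runs it against made-up content; the host names and patterns are invented for illustration and are not the ones used on the real blog.

```php
<?php
// Hypothetical helper: apply the replacement only when the search pattern matches.
// Returns the new content, or null when the post should be left alone.
function rewrite_content(string $content, string $search_pattern, string $replace_search_pattern, string $replacement)
{
    if (preg_match($search_pattern, $content) !== 1) {
        return null; // no match (or a regex error), so skip this post
    }

    $updated = preg_replace($replace_search_pattern, $replacement, $content);

    // preg_replace returns null on error; also skip if nothing actually changed
    if ($updated === null || $updated === $content) {
        return null;
    }

    return $updated;
}

// Made-up example: rewrite image URLs from an old host to a new one
$sample = '<img src="http://old.example.com/pic.jpg">';
$result = rewrite_content(
    $sample,
    '#old\.example\.com#',
    '#http://old\.example\.com/#',
    'https://new.example.com/'
);
print $result."\n";
```

Run at the command line, this prints the sample tag with its src changed to https://new.example.com/pic.jpg. Returning null for "nothing to do" also means a post is never sent back to the API unchanged.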

It’s a little bit technical in places, but most of this code was lifted from the WordPress API documentation. As I said at the start, this is not a working solution that you can just paste in; it’s a guide to how you can interact with the WordPress.com API from PHP. Hopefully it will be useful to somebody else at some point.

Posted by Jonathan Beckett in Notes, 0 comments

Problems with Breaking Inheritance and Limited Access User Permission Lock Down Mode in SharePoint

What is “Limited Access User Permission Lock Down Mode” ?

Let’s start by describing a little-known site collection feature called “Limited Access User Permission Lock Down Mode”. When enabled, it stops users from viewing the list that contains a file they have been given specific access to. In some cases it seems to stop Microsoft Office from working correctly too.

The reason you might use it is to allow a user read access to a specific file within a SharePoint library, but not let them modify the URL in order to see the list; essentially only the URL to the file will work for them.

If you switch off the site collection feature, the user will be able to at least see the library within SharePoint containing the file they have access to.

How does this relate to Permissions ?

It just so happens I developed a PowerShell script for a client that creates sub-sites for projects – breaking permissions inheritance on each sub-site, and wiring up custom groups, and permissions for them for each sub-site (e.g. “Project A”, with groups “Project A Owners”, “Project A Members”, and so on).

It turns out the method used in the PowerShell script to break permissions inheritance on the sub-site was incorrect (although advocated by Microsoft I might add).

I used the following method :

$web.RoleDefinitions.BreakInheritance($true,$false)

It turns out this does something that is impossible through the SharePoint interface. It not only breaks inheritance and copies the group assignments to the sub-site, it also breaks inheritance of the Permission Levels (aka “permission sets”), and creates new permission sets tied to the sub-site with the same names as those of the parent. The tell-tale sign that this has happened is that checkboxes appear next to the permission set names when viewed from the sub-site (via “view site permissions”).

Why is this important? Because when the permission sets are copied, the configuration of the Limited Access User Permission Lock Down Mode feature is copied with them. If the feature is later enabled or disabled at the site collection level (it’s a site collection feature, remember), the change will not affect sub-sites whose inheritance was broken this way.

How can it be fixed ?

When you create a sub-site via PowerShell, you need to use a slightly different method to break permissions inheritance :

$web.BreakRoleInheritance($true,$false)

This method copies the existing group assignments, but inherits the permission sets. It’s obviously the method used by the SharePoint interface, which exhibits the same behaviour.

If you have already created a number of sub-sites, they can be repaired by writing a PowerShell script to iterate through them, first reading the groups and roles assigned to them, then re-inheriting, and re-breaking permission inheritance correctly, before re-building the group and role assignments appropriately.
