One of the public APIs students request most often is a class/catalog API. The fact that there’s no API for class/catalog data hasn’t stopped eager student developers. They usually end up scraping data from the catalog and storing it in a database where their applications can easily access it. We are happy to say that we have started work on a solution to this common student developer problem 🙂

The API will allow developers to query classes by term, subject and course number to retrieve full class information, including details, teacher, class availability and other information publicly available via the course catalog. We have started our design process on GitHub: https://github.com/osu-mist/courses-api-design using the OpenAPI Initiative format (formerly known as Swagger). You can use the Swagger Editor links below to see the first draft of the design:

Let us know what you think either with a comment or pull request. This design and the first implementation of this API won’t be final. Our goal is to release a beta version of the API, and collect developer feedback.

 

Posted in API.

When the Hackathon rolled around our team wanted to bring a project that extended or utilized our current CMS for a new purpose. We wanted something that we could start and finish in a single day and we also wanted a project that would take advantage of a team with diverse skill sets.

Enter bennyslist, “a classified advertisements website exclusive to the Oregon State community with sections devoted to buying and selling textbooks, electronics, bikes and just about anything students need. Students can create and respond to listings through the site, making it a safe, secure place to buy and sell.” That description is taken directly from our about page.


Bennyslist is a group within the main.oregonstate.edu Drupal site. We decided to use our CMS because it allows us to tie into existing university themes and branding, and it also gives us quick access to some powerful components.

CAS Authentication

At the core of the concept is that bennyslist is private to the OSU community. Craigslist fails because of the constant bombardment of spam and phishing attempts. Just try to sell anything and you will see for yourself: it can be an aggravating experience. Because OSU Drupal already has an authentication tool built into it, we were easily able to specify which components would be “gated” via ONID login.

If you want to see what is available or want to post something for sale yourself then you will be required to use your OSU issued account. This was also nice because it did not add yet another account credential to store, memorize or update.

Webform

The second component that we used was Drupal Webforms. Our audience can go to the form and fill out what they have for sale, and once submitted those entries are dynamically displayed in our listings. This allows them to come back and edit their posts, and it also means the process is fully automated. No staff time is required to move submissions onto the for sale/trade list. However, it also provides a way for administrators to remove postings that are deemed unacceptable.


Because of these components it was an easy decision to go with a centrally hosted CMS as opposed to trying to create our own home grown application. There are certainly some compromises to make, but in the end we were able to create a fully automated and functional tool.

The rest of the work went into marketing and communications. We had designers create attractive images, a writer help with copywriting and a video producer even had time to create a commercial for the tool. Taking a fully integrated approach to our product allowed us to deliver a completed product in the time allowed.

So go ahead and give it a try, and think about how you use Drupal and OSU templates. They do have some limitations, but that doesn’t mean you can’t explore different functionality and strategy for your sites.

The Team

  • Callie Newton – Web Editor and Writer
  • Oliver Day – Interactive Designer
  • Santiago Uceda – Assistant Director (Illustrator for this project)
  • Darryl Lai – Multimedia Producer
  • Kegan Sims – Drupal Architect

I run two email newsletters for the Graduate School. One delivered weekly through MailChimp and one monthly through lists.oregonstate.edu. The monthly newsletter is also posted to our Drupal website. I spend a lot of time on these newsletters and what follows is how I write and proof them. I also use these methods for copy editing blog posts and other types of editing and writing (minus the CSS inlining.) At the end, I’ll share some additional tools that I hope to incorporate in the future.

Write in Markdown

For me, the easiest way to write my newsletters is in Markdown. While Markdown is not an exact standard, there are enough services using it that it is fairly well supported across the web and within tools. GitHub supports it. BeeGit is a content writing and editing platform that uses it.

I use the Sublime Text 3 text editor with the Markdown Editing package. The Markdown Editing package gives you some special highlighting, a color scheme, and other niceties. It does not, however, provide an HTML preview. When first writing in Markdown, I used BeeGit for its preview and Markdown cheat sheet until I became more comfortable with the syntax.

Why do I use Markdown? My top reasons:

  • Conversion to HTML with Pandoc (more on this later)
  • I can use my text editor (goodbye Word)
  • The files are only plain text for maximum preservation value
  • Creating a link is a breeze
  • Creating a link is a breeze
  • Creating a link is a breeze

So yeah, creating links is a breeze. My newsletters take the format of blurb and link, blurb and link, etc. If I had to create each link using a WYSIWYG, I would find a new line of work.
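
To make the "blurb and link" format concrete, here is a minimal Python sketch that assembles one newsletter entry as Markdown. The function names and the example URL are illustrative assumptions, not part of the workflow described in this post, which is written by hand in a text editor.

```python
def markdown_link(text: str, url: str) -> str:
    """Return an inline Markdown link, escaping square brackets in the link text."""
    safe_text = text.replace("[", r"\[").replace("]", r"\]")
    return f"[{safe_text}]({url})"


def newsletter_item(blurb: str, link_text: str, url: str) -> str:
    """Build one 'blurb and link' entry: the blurb followed by its link."""
    return f"{blurb} {markdown_link(link_text, url)}\n"


if __name__ == "__main__":
    # Hypothetical entry; the URL is a placeholder.
    print(newsletter_item(
        "Fall registration opens soon.",
        "Read more",
        "https://example.com/registration",
    ))
```

The whole link is one short line of plain text, which is the point being made above: no dialog boxes, no WYSIWYG.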

Words to avoid, grammar and style

Some talented people have released tools that check your text for common errors beyond spelling mistakes. I use four of these.

Proselint focuses on usage, not grammar. Here’s a list of what it checks for. It is a command-line only tool at this time.

retext-mapbox-standard is a combination of language tools that checks for gendered language and potential slurs, words to avoid in educational writing, jargon, and more, plus it can read Markdown. The project is meant as an example of what organizations can do to enforce their own style guides, but I use it as provided by Mapbox. Also a command-line tool.

OSU copy cop is a tool I made that checks some of the editorial standards set forth by OSU. Saying “I made” isn’t really true: I copied it from the original Copy cop and added a few OSU things.

Grammarly is a web and desktop application that checks usage, grammar, spelling, and more. It is available in a free tier and a paid tier. I copy my text into it and out of it, which isn’t efficient, but it gets the job done and it doesn’t complain about the Markdown I paste in. The available browser plugin also checks your text while you write posts on websites like Facebook and Twitter, which can help you avoid some embarrassing mistakes.

Pandoc to convert to HTML

Pandoc is a fantastic tool that converts between file types. Converting Markdown to HTML goes like this:

pandoc -o file-out.html file-in.md

That’s it and bam! HTML ready to go.

Add CSS to the header

For my monthly email newsletter I like to inline some of my CSS into the HTML, so before sending it through an inliner tool (see below) I have pandoc create a standalone HTML document with my CSS in the HEAD of that doc. If you give Pandoc the -H option it will grab the contents of that file and put it in the HEAD of the doc you are creating.

pandoc -s -o file-out.html file-in.md -H add-to-head.html

The add-to-head.html file looks like:

<style>
  p {
    margin-bottom: 1em;
  }
</style>
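
If you run these conversions often, the two pandoc invocations can be wrapped in a small script. The following is a sketch only: the function names are made up for illustration, and it assumes pandoc is installed and on your PATH. It uses the same `-o`, `-s` and `-H` options discussed above.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def pandoc_command(source: Path, target: Path,
                   head_file: Optional[Path] = None) -> List[str]:
    """Build the pandoc argument list for a Markdown-to-HTML conversion.

    With head_file set, request a standalone document (-s) and include
    the file's contents in the HTML head (-H).
    """
    cmd = ["pandoc", "-o", str(target), str(source)]
    if head_file is not None:
        cmd += ["-s", "-H", str(head_file)]
    return cmd


def convert(source: Path, target: Path, head_file: Optional[Path] = None) -> None:
    """Run pandoc, failing loudly if it is missing or returns an error."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(pandoc_command(source, target, head_file), check=True)


if __name__ == "__main__":
    # Only prints the command; call convert(...) to actually run pandoc.
    print(pandoc_command(Path("file-in.md"), Path("file-out.html"),
                         Path("add-to-head.html")))
```

Building the argument list separately from running it makes the command easy to inspect before anything touches your files.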

Inline the CSS

Now that my HTML doc is ready with the styles in the HEAD, I can run the whole thing through an inliner tool and it will put the styles inline with my HTML. I use Mailchimp’s inliner tool for this. I think there are some command line tools for this (like Juice) but this website makes it quick.
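
To show roughly what an inliner does, here is a deliberately naive Python sketch. It is not how Mailchimp's tool or Juice works; it is an illustration under strong assumptions: it only understands rules whose selector is a single tag name (like the `p` rule above), it ignores specificity, existing `style` attributes and void elements such as `<br />`, and it leaves the original style block in place. The function names are made up for this example.

```python
import re
from typing import Dict


def parse_tag_rules(css: str) -> Dict[str, str]:
    """Collect 'tag { declarations }' rules; skip any other kind of selector."""
    rules: Dict[str, str] = {}
    for block in css.split("}"):
        selector, _, body = block.partition("{")
        selector = selector.strip()
        if re.fullmatch(r"[A-Za-z][A-Za-z0-9]*", selector):
            # Collapse whitespace so the declarations fit on one attribute line.
            rules[selector] = " ".join(body.split())
    return rules


def inline_tag_styles(html: str) -> str:
    """Copy simple tag rules from the first style block onto matching opening tags."""
    match = re.search(r"<style[^>]*>(.*?)</style>", html, flags=re.S | re.I)
    if match is None:
        return html
    for tag, declarations in parse_tag_rules(match.group(1)).items():
        # Matches <p> and <p class="x">, but not <pre>.
        opening_tag = re.compile(rf"<{tag}(\s[^>]*)?>", flags=re.I)
        html = opening_tag.sub(
            lambda m, d=declarations: f'<{m.group(0)[1:-1]} style="{d}">',
            html,
        )
    return html


if __name__ == "__main__":
    doc = ("<html><head><style>\n  p {\n    margin-bottom: 1em;\n  }\n</style></head>"
           "<body><p>Hello</p><pre>code</pre></body></html>")
    print(inline_tag_styles(doc))
```

For a real newsletter, a dedicated inliner remains the safer choice, since email clients are unforgiving about malformed markup.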

Add to Drupal and send email

Finally, after the inliner step, my newsletter’s HTML is ready. I go to my Drupal site and paste in the HTML. From there I copy the text and paste it directly into Gmail for sending. The result is a plain, single column newsletter layout. Here’s an example.

Future improvements

I’d like to add an HTML template system (like this or this) to my workflow so I can create better layouts for the email newsletters. For that, I’ll need to go back to an email program that allows me to edit the HTML directly, like Thunderbird, or use an email service provider. For Mailchimp, WordPress, or anywhere else I just need HTML, I follow the steps above but stop after I convert the Markdown to HTML.

— John McQueen, Web Communications, Oregon State University Graduate School

The OSU Developer Portal (https://developer.oregonstate.edu) currently has two APIs available, Directory (for people) and Location (for campus locations). This post describes my experience developing a “hello world” framework with the Location API. I got excellent reuse of the framework code when also experimenting with the Directory API.

I logged on to the OSU Developer Portal, registered a new app, and got the Consumer Key and Consumer Secret strings that are required to make calls to the API.

A Consumer Key looks something like m5l2jS54r7XqkkvJVovdpUY1o4DMl0la and a Consumer Secret like koMYj5HVd9m963a8. Not the actual ones! Get yer own!

Calling an OSU API has two steps:

  1. An HTTP POST to the getAccessToken method, with Consumer Key and Consumer Secret, to get an access token.
  2. An HTTP GET to the desired API method, here getLocations, with the access token obtained above and any optional parameters, here a q query string.
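
The two steps can be sketched in a few lines of Python using only the standard library. This is an illustration, not an official client: the token path (`v1/locations/token`), the form field names and the `access_token` response field are taken from the C# code later in this post, and the function names are made up. Check the Developer Portal documentation for the current endpoints before relying on it.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.oregonstate.edu"


def build_token_request(resource: str, consumer_key: str,
                        consumer_secret: str) -> urllib.request.Request:
    """Step 1: POST the Consumer Key and Consumer Secret as form data."""
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": consumer_key,
        "client_secret": consumer_secret,
    }).encode("ascii")
    url = f"{BASE_URL}/{resource.strip('/')}/token"
    return urllib.request.Request(url, data=form, method="POST")


def build_locations_request(access_token: str, query: str) -> urllib.request.Request:
    """Step 2: GET the locations resource with the bearer token and a q parameter."""
    url = f"{BASE_URL}/v1/locations?" + urllib.parse.urlencode({"q": query})
    headers = {"Authorization": f"Bearer {access_token}"}
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_locations(consumer_key: str, consumer_secret: str, query: str) -> dict:
    """Run both steps against the live API (requires valid credentials)."""
    token_request = build_token_request("v1/locations", consumer_key, consumer_secret)
    with urllib.request.urlopen(token_request) as response:
        token = json.load(response)["access_token"]
    with urllib.request.urlopen(build_locations_request(token, query)) as response:
        return json.load(response)
```

Separating request construction from the network call makes each step easy to inspect without credentials.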

For example, calling getLocations with the query q=kerr returns this JSON:

{
  "links": {
    "self": "https://api.oregonstate.edu/v1/locations?q=kerr&page[number]=1&page[size]=10",
    "first": "https://api.oregonstate.edu/v1/locations?q=kerr&page[number]=1&page[size]=10",
    "last": "https://api.oregonstate.edu/v1/locations?q=kerr&page[number]=1&page[size]=10",
    "prev": null,
    "next": null
  },
  "data": [
    {
      "id": "2e9ee2d06066654f61a178560c2c137a",
      "type": "locations",
      "attributes": {
        "name": "Kerr Administration Building",
        "abbreviation": "KAd",
        "latitude": "44.5640654089",
        "longitude": "-123.274740377",
        "summary": "Kerr houses OSU’s top administrative offices and many services for students, including admissions, financial aid, registrar’s office and employment services. If you’d like a student-led campus tour, stop in 108 Kerr to make arrangements.",
        "description": null,
        "address": "1500 SW Jefferson Avenue.",
        "city": "Corvallis",
        "state": "OR",
        "zip": null,
        "county": null,
        "telephone": null,
        "fax": null,
        "thumbnails": [
          "http://oregonstate.edu/campusmap/img/kad001.jpg"
        ],
        "images": [
          null
        ],
        "departments": null,
        "website": "http://oregonstate.edu/campusmap/locations=766",
        "sqft": null,
        "calendar": null,
        "campus": "corvallis",
        "type": "building",
        "openHours": {
        }
      },
      "links": {
        "self": "https://api.oregonstate.edu/v1/locations/2e9ee2d06066654f61a178560c2c137a"
      }
    }
  ]
}
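
Before getting into the .NET classes, here is a quick Python sketch of walking this structure by hand. The sample payload is a trimmed copy of the response above, and the helper names are assumptions made for illustration. Note that latitude and longitude arrive as strings and need converting.

```python
import json
from typing import List, Tuple

# Trimmed copy of the response shown above.
SAMPLE_RESPONSE = """
{
  "links": {"prev": null, "next": null},
  "data": [
    {
      "id": "2e9ee2d06066654f61a178560c2c137a",
      "type": "locations",
      "attributes": {
        "name": "Kerr Administration Building",
        "abbreviation": "KAd",
        "latitude": "44.5640654089",
        "longitude": "-123.274740377"
      }
    }
  ]
}
"""


def location_names(payload: dict) -> List[str]:
    """Return the name of every location in the 'data' array."""
    return [item["attributes"]["name"] for item in payload["data"]]


def coordinates(item: dict) -> Tuple[float, float]:
    """Convert the string latitude/longitude of one location to floats."""
    attributes = item["attributes"]
    return float(attributes["latitude"]), float(attributes["longitude"])


if __name__ == "__main__":
    payload = json.loads(SAMPLE_RESPONSE)
    print(location_names(payload))
    print(coordinates(payload["data"][0]))
```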

I used the Microsoft ASP.NET Web API 2.2 Client Libraries NuGet package. This package has features to automagically deserialize JSON into plain ol’ class objects.

These are the class objects. .NET “generic” classes play an important role in the overall solution. Generic classes have a type parameter appended to their name, here <T>, where T is a naming convention used widely by .NET developers; wherever the type parameter (“T”) shows up in the class body is where the generic replacement is propagated when the type is constructed.

using System.Collections.Generic;

namespace OregonState.Api.JsonDataTransferObjects
{
  //first a few framework classes that can be reused later for other APIs...

  //root of all JSON returned by the OSU APIs.
  //note how the "links" and "data" fields 
  //  correspond with items of the same name in the JSON.
  public class RootObject<T>
  {
    public Links links { get; set; }
    public List<T> data { get; set; }
  }

  public class Links
  {
    public string self { get; set; }
    public string first { get; set; }
    public string last { get; set; }
    public string prev { get; set; }
    public string next { get; set; }
  }

  //Abstract (must inherit) class that corresponds to "id",
  // "type" and "attributes" of root object JSON
  public abstract class ObjectData<T>
  {
    public string id { get; set; }
    public string type { get; set; }
    public T attributes { get; set; }
  }

  //now classes for the solution at hand...

  //constructed type: class declared from a generic type 
  //  by supplying type arguments for its type parameters.
  //  now we will be able to work with RootObject<LocationData>,
  //  and the List in the "data" field will be of type LocationData
  public class LocationData : OregonState.Api.JsonDataTransferObjects.ObjectData<LocationAttributes>
  {
  }

  public class LocationAttributes
  {
    public string name { get; set; }
    public string abbreviation { get; set; }
    public string latitude { get; set; }
    public string longitude { get; set; }
    public string summary { get; set; }
    public string description { get; set; }
    public string address { get; set; }
    public string city { get; set; }
    public string state { get; set; }
    public string zip { get; set; }
    public string county { get; set; }
    public string telephone { get; set; }
    public string fax { get; set; }
    public List<string> thumbnails { get; set; }
    public List<string> images { get; set; }
    public object departments { get; set; }
    public string website { get; set; }
    public string sqft { get; set; }
    public string calendar { get; set; }
    public string campus { get; set; }
    public string type { get; set; }
    public LocationOpenHours openHours { get; set; }
  }

  public class LocationOpenHours
  {
    //todo: not sure what attribute name is
  }
}
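
For readers who don't work in .NET, the same generic-envelope idea can be sketched with Python's typing module. This is an analogy rather than a translation: here the type parameter stands for the attributes class directly, whereas the C# above uses a LocationData subclass, and only two of the attribute fields are modeled. The `parse_locations` helper is made up for the example.

```python
from dataclasses import dataclass
from typing import Generic, List, Optional, TypeVar

T = TypeVar("T")


@dataclass
class ObjectData(Generic[T]):
    """One entry of the 'data' array: id, type and typed attributes."""
    id: str
    type: str
    attributes: T


@dataclass
class RootObject(Generic[T]):
    """Reusable envelope: 'links' plus a list of typed data entries."""
    links: dict
    data: List[ObjectData[T]]


@dataclass
class LocationAttributes:
    """A small subset of the location attributes, for illustration."""
    name: str
    abbreviation: Optional[str]


def parse_locations(payload: dict) -> RootObject[LocationAttributes]:
    """Map an already-decoded JSON payload onto the typed classes."""
    items = [
        ObjectData(
            id=entry["id"],
            type=entry["type"],
            attributes=LocationAttributes(
                name=entry["attributes"]["name"],
                abbreviation=entry["attributes"].get("abbreviation"),
            ),
        )
        for entry in payload["data"]
    ]
    return RootObject(links=payload["links"], data=items)
```

Only LocationAttributes and the small parsing function are specific to locations; the envelope classes could be reused unchanged for another API such as Directory.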

Now, just enough code to get this test to pass (which it does!). Note the async and await keywords used for asynchronous programming in the .NET framework.

  [TestClass()]
  public class LocationServiceIntegrationTests {
    
      //initialize API service client, here constructed to handle the type LocationData
      private static readonly ApiServiceClient<LocationData> myServiceClient = MerthjTestApp1ApiServiceClientFactory.CreateLocationDataApiServiceClient();
  
      //gotcha: even unit tests, which normally do not return anything,
      // must now be "async" and return a "Task"
      // because of "await" keyword
      [TestMethod()]  
      public async Task LocationServiceClient_Should_ReturnCorrectResultForQueryOnKerr() {
          RootObject<LocationData> result = await myServiceClient.Request("?q=kerr");
          Assert.AreEqual("Kerr Administration Building", result.data[0].attributes.name);
      }
  }

I use the factory pattern to construct the API client because construction takes several steps, and simply NEW-ing the client wherever it is needed would duplicate that code.

public sealed class MerthjTestApp1ApiServiceClientFactory
{
	//todo: move to configuration
	// (here only because I liked to see the actual address)
	const string ApiOregonstateBaseAddressUriText = "https://api.oregonstate.edu";

	public static ApiServiceClient<LocationData> CreateLocationDataApiServiceClient()
	{
		//pass in a new LocationService, a service which makes actual calls
		// to the API (but we could also use "mock" service here...)
		//note use of Visual Studio's "My.Resources" to store Consumer Key and Consumer Secret
		//  in a resource file; resource file SHOULD NOT EVER be checked in
		//  to revision control, or your secrets are out there for all to see.
		return new ApiServiceClient<LocationData>(new LocationService(), ApiOregonstateBaseAddressUriText, "v1/locations", My.Resources.OSU_Developer_Portal_Secrets_Resource.ConsumerKey, My.Resources.OSU_Developer_Portal_Secrets_Resource.ConsumerSecret);
	}
}

An API service client is responsible for managing the access token required before each API call.

using System;
using System.Diagnostics;
using System.Threading.Tasks;

public class ApiServiceClient<T> {
    
    private readonly IApiService<T> myService;
    private readonly string myResourceRelativeUriText;
    private readonly string myBaseUriText;
    private readonly string myConsumerKey;
    private readonly string myConsumerSecret;
    
    //Lazy object does not run its factory until its ".Value" property
    // is first read, and then only once (see CreateRequestAsync),
    // so the access token is fetched a single time per client.
    private readonly Lazy<Task<string>> myAccessToken;
    
    //no-arg constructor is private so clients may not use it
    private ApiServiceClient() {
    }
    
    //constructor required for use by clients
    public ApiServiceClient(IApiService<T> service, string baseAddressUriText, string resourceRelativeUriText, string consumerKey, string consumerSecret) {
        // todo: validate parameters
        this.myService = service;
        this.myResourceRelativeUriText = resourceRelativeUriText;
        this.myBaseUriText = baseAddressUriText;
        this.myConsumerKey = consumerKey;
        this.myConsumerSecret = consumerSecret;
        //created here because a field initializer cannot reference "this"
        this.myAccessToken = new Lazy<Task<string>>(this.GetAccessTokenAsync);
    }
    
    //use the service to get the access token
    async Task<string> GetAccessTokenAsync() {
        Debug.WriteLine("Entering GetAccessTokenAsync");
        return await this.myService.GetAccessToken(this.myBaseUriText, this.myResourceRelativeUriText, this.myConsumerKey, this.myConsumerSecret);
    }
    
    async Task<ApiRequest> CreateRequestAsync(string query) {
        return new ApiRequest
        {
          //actually fetch the access token here!
          AccessToken = await this.myAccessToken.Value,
          BaseAddressUriText = this.myBaseUriText,
          ResourceRelativeUriText = this.myResourceRelativeUriText,
          Query = query
        };
    }
    
    //clients can request API result as raw JSON string for troubleshooting...
    public async Task<string> RequestAsString(string query) {
        return await this.myService.GetContentAndReadAsString(await this.CreateRequestAsync(query));
    }
    
    //most clients will use this method
    public async Task<RootObject<T>> Request(string query) {
        return await this.myService.GetContentAndReadAsType(await this.CreateRequestAsync(query));
    }
}
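
The "fetch the token once, on first use" behavior can be shown in isolation. The Python sketch below is a simplified, synchronous stand-in for that idea, not a port of the class above. The `LazyToken` name and the injected `fetch` callable are assumptions for illustration, and token expiry and refresh are deliberately left out.

```python
import threading
from typing import Callable, Optional


class LazyToken:
    """Fetch an access token on first use and reuse it afterwards."""

    def __init__(self, fetch: Callable[[], str]) -> None:
        self._fetch = fetch
        self._value: Optional[str] = None
        self._lock = threading.Lock()

    @property
    def value(self) -> str:
        # The lock ensures concurrent first reads trigger only one fetch.
        with self._lock:
            if self._value is None:
                self._value = self._fetch()
            return self._value


if __name__ == "__main__":
    calls = []

    def fake_fetch() -> str:
        # Stand-in for the real HTTP POST that obtains a token.
        calls.append(1)
        return "example-token"

    token = LazyToken(fake_fetch)
    print(token.value, token.value, len(calls))
```

Injecting the fetch function keeps the caching logic testable without any network access, which is the same reason the client above takes its service as a constructor argument.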

The API Service class is a messaging gateway, which encapsulates the API messaging code, and has strongly-typed methods. (See especially “public Task<RootObject<T>> GetContentAndReadAsType(ApiRequest request)”.)

using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public abstract class ApiService<T> : IApiService<T>
{
	private static FormUrlEncodedContent CreateFormUrlEncodedContentForGetAccessToken(string consumerKey, string consumerSecret)
	{
		//Data required by OSU Get Access Token API
		var keys = new[] {
			new KeyValuePair<string, string>("grant_type", "client_credentials"),
			new KeyValuePair<string, string>("client_id", consumerKey),
			new KeyValuePair<string, string>("client_secret", consumerSecret)
		};

		//Also sets content headers content type to "application/x-www-form-urlencoded".
		return new FormUrlEncodedContent(keys);
	}

	//"Async" declaration because we are calling "GetAsync" asynchronous method.
	//.NET compiler builds an absolutely CRAZY state machine in code behind the scenes.
	private async static Task<HttpResponseMessage> GetResponseAsync(HttpClient client, ApiRequest request)
	{
		client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", request.AccessToken);

		//Await-ing a task suspends this method and yields control to its caller until the task completes.
		//Could actually do other things in parallel here before we await:
		// var responseTask = client.GetAsync(request.ResourceRelativeUriText + request.Query);
		// DoIndependentWork();
		// var response = await responseTask;

		var response = await client.GetAsync(request.ResourceRelativeUriText + request.Query);

		//ApiService doesn't know how to handle failed responses yet, so throw exception.
		if (!response.IsSuccessStatusCode) {
			throw new InvalidOperationException(response.ReasonPhrase);
		}

		return response;
	}

	private static HttpClient CreateHttpClient(string baseAddressUriText)
	{
		return new HttpClient { BaseAddress = new Uri(baseAddressUriText) };
	}

	public async Task<string> GetAccessToken(string baseAddressUriText, string resourceRelativeUriText, string consumerKey, string consumerSecret)
	{
		//for a given resource Uri, like "v1/locations" or "v1/locations/", append segment for the access token to get "v1/locations/token"
		//todo: improve with something like http://stackoverflow.com/questions/372865/path-combine-for-urls
		var requestUri = new Uri(resourceRelativeUriText.TrimEnd('/') + "/token", UriKind.Relative);
		var content = CreateFormUrlEncodedContentForGetAccessToken(consumerKey, consumerSecret);

		//"Using" automatically and absolutely calls "Dispose" method on object at the end of the code block.
		//Disposes any expensive and/or unmanaged resources, like an open connection stream.
		using (var client = CreateHttpClient(baseAddressUriText)) {
			using (var response = await client.PostAsync(requestUri, content)) {
				if (!response.IsSuccessStatusCode) {
					throw new InvalidOperationException(response.ReasonPhrase);
				}

				//Shorthand for:
				// var tokenResponse = await response.Content.ReadAsAsync<GetAccessTokenResponse>();
				// return tokenResponse.access_token;
				return (await response.Content.ReadAsAsync<GetAccessTokenResponse>()).access_token;
			}
		}
	}

	//Read as string to support troubleshooting and curiosity.
	//Most clients should use the other method to "read as type," which reads data into strongly-typed classes
	public async Task<string> GetContentAndReadAsString(ApiRequest request)
	{
		using (var client = CreateHttpClient(request.BaseAddressUriText)) {
			using (var response = await GetResponseAsync(client, request)) {
				//await before leaving the using blocks, or the response is disposed mid-read
				return await response.Content.ReadAsStringAsync();
			}
		}
	}

	public async Task<RootObject<T>> GetContentAndReadAsType(ApiRequest request)
	{
		using (var client = CreateHttpClient(request.BaseAddressUriText)) {
			using (var response = await GetResponseAsync(client, request)) {
				//The magic deserialization of JSON
				// into .NET classes happens in this one line...
				//Declaration of ReadAsAsync is:
				//  public static Task<T> ReadAsAsync<T>(this HttpContent content)
				//This class (ApiService) is generic so we can take advantage of ReadAsAsync being generic.
				return await response.Content.ReadAsAsync<RootObject<T>>();
			}
		}
	}
}

And finally, the Location Service class inherits from API Service with the type parameter LocationData.

//Inherits ApiService, so we get its interface and functionality for free.
public class LocationService : ApiService<LocationData>
{
}

(This WordPress blog did not seem to have syntax highlighting for Visual Basic. So, I translated my existing code from VB.NET to C# using several online translators. There are bound to be errors introduced in the translation process. I know, I know. It’s not “cool” to write in VB… the poor under-appreciated language that can do everything–and more!–its currently-hip sibling C# can do. But I have more than 17 years of VB development under my belt, so I’m not switching now just to be cool. I prefer to deliver business value rather than struggle with silly braces and semicolons. 🙂)

Posted in API.

During our original exploration of APIs, we began by learning about the current space. We read books, watched webinars and used Gartner guides. One question that we couldn’t find an answer to was: what are other universities doing with APIs? This question led to the beginning of the API survey. The results below were shared with the ITANA mailing list in a raw format.

The list of questions was developed in collaboration with members from ITANA (a group of IT architects from Higher Ed). Due to the high number of questions, some questions were not included in the final version of the survey. In order to give more freedom to participants, the survey didn’t ask for emails or names. These results were shared with the API subgroup of ITANA via the mailing list. We are now putting the results in a blogpost to make it easier to discover in the future. If you have any questions about the survey format, design or questions, drop us a note in the comments section below.

1. What is the name of your higher education institution?
* University of Michigan
* Simon Fraser University
* University of Washington
* Virginia Tech
* The University of Toledo
* Northwestern University
* George Mason University
* Columbia University
* UMUC
* University of Michigan
* University of Toronto
* University Of Chicago
* Brigham Young University
* University of Wisconsin – Madison
* University of Michigan
* University of Chicago
* Johnson & Wales University
* Yale
* Minnesota State Colleges & Universities
* University of California at San Diego
* Oregon State University
* Yale University

2. What is the enrollment size of your higher education institution?

Minimum: 5,000
Max: 140,000
Average: 38,000

3. Is your higher education institution currently working on Web Services, Service Oriented Architecture?

Answer | Responses
Yes | 18
No | 4

4. What is the FTE size of the team?

Responses
1
2 central IT teams: 7 operating suite of REST Web Services; 10+ building new Enterprise Integration Platform (EIP)
3
6
9
3
2.5
5
2 FTE - infrastructure. Many FTE developers who work with web services and soa, probably over 20 known to me.
4
6
12
5
2
7

5. Is there one central department working on this effort or multiple departments?

Answer | Responses
A single central department | 7
Multiple departments | 8

6. Was this initiative set up by top management or by a small group(s) / department(s)?

Answer | Responses
A single central department | 7
Multiple departments | 8

7. What technologies are you using?

Answer | Responses
Enterprise Service Bus (ESB) | 7
API Gateway | 10
RPC | 0
REST | 14
SOAP | 8
Other | 4

Other responses included:
* AWS SNS/SQS
* Services Registry, Message Broker
* Custom API’s, Vendor App API’s
* JMS

8. What technologies are you trying to phase out, if any?
* SOAP, XML/RPC
* migrating from custom .NET REST APIs to EIP with API Management
* point-to-point integrations, batch data transfers, database links
* PeopleSoft customizations
* Direct DB connections
* Batch download
* Looking to replace custom API codebases with iPaaS solution
* SOAP
* custom middleware
* Hub-based Webmethods

9. Are these web services / APIs accessed by internal departments / groups within your higher ed institution, or external 3rd party vendors?

Answer | Responses
Internal departments | 15
3rd party vendors | 10

10. Where do you publish the list of web services / APIs available?
* Still in its infancy, but currently at http://www.sfu.ca/data-hub.html
* http://webservices.washington.edu/ [this is the current, custom Web Services Registry … will be migrated to an API Management tool]
* Intranet
* Services Registry
* Internal wiki
* API Manager Application
* The intent is to simply use a web page
* No central place yet. This would be part of the benefit of a new iPaaS that has solid API management.
* n/a
* API Manager
* https://developers.yale.edu/
* Currently don’t have a good inventory. Looking to publish a list using an API Management Service
* Planning to publish using api manager
* not published online yet

11. What’s the URL of your web services / API / SOA documentation?
* http://www.sfu.ca/data-hub/api.html
* http://webservices.washington.edu/ [also extensive UW-centric documentation in Confluence wiki sites]
* Intranet
* https://serviceregistry.northwestern.edu
* Not accessible outside
* developer.it.umich.edu
* not yet available
* No central place yet. This would be part of the benefit of a new iPaaS that has solid API management.
* n/a
* Not available yet
* https://developers.yale.edu/
* N/A
* Documentation is not published yet
* not published online yet

12. What is the development stack used for developing SOA / API / web services?

Answer | Responses
Java / jvm | 14
Node | 4
Python | 3
.NET | 6
Ruby | 4
Other | 5

Others:
* and many others
* PeopleTools, WSO2 Data Services
* Custom PHP.
* PHP and Perl
* JavaScript, iOS, Android

13. What are the primary benefits you are seeing from your API strategy?
* None yet, as it’s still in its infancy. The goal is to open up SFU data and encourage developers to consume it. The classic example is the mobile app. There are currently several in the Apple app store that rely on screen scraping to get the job done. We’d like to see that go away and encourage good development by students. This could translate into a better reputation for the university as a leading edge institute.
* close relationship between data management initiatives/governance and our ROA (Resource-Oriented Architecture) Web Services has made the governance of Web Services easier than it might have been otherwise. Also, maintenance has been easier since the number of Web Services roughly equals the number of data domains (a handful) with several endpoints per service which roughly equate to primary data tables (e.g. student). No specialized development to deliver only certain data to certain clients. Biggest benefit may be the cumulative effect on IT culture, that developers now expect there to be APIs for data.
* Promote reuse, easier to maintain
* We are early in the process, but we are seeing some benefits in enabling consumption of identity data and in the integration of cloud-based systems with our on-premise systems.
* Ability of central IT to enable others to get done what they need to. Ability to swap in modern systems of record for legacy systems.
* Re-use of services. Changing integration patterns of copying data locally.
* We’ve switched the strategy from IBM’s MQ and SOAP to REST-style Web API’s secured with OAuth 2.0 access tokens, and have seen much improved interest from the developers in the Divisions. Two Divisions have started developing applications to use the services
* More modern and sustainable integrations. Data transparency and opportunities for distributed app development around the data.
* Lower cost of adoption for new customers. Centralized and consistent security model. Well defined data models have helped to define better APIs.
* Development of mobile applications
* Metrics on usage and types of applications using the data
* Hoping to solve integration challenges. Increased security versus direct database connections.
* Reusability, de-coupling database, discoverability
* Centralizing access to data. Having conversations with people to come up with a consensus to describe data models. Developing one location where developers on campus can go to request access to data and view documentation
* Normalized and consistent abstraction layer to institutional data.

14. What are the primary challenges you’ve seen and are running into with your SOA / API strategy?
* No budget.
* without a strong executive mandate (ala Bezos at Amazon), adoption velocity is slow, especially with established applications that already have privileged access to enterprise administrative data and don’t need to re-invest in a SOA approach. Most success with non-central IT where such privileges don’t exist and with disruptive forces such as new SAAS vendors where data is not easy to get without a SOA approach. Another challenge is the push-back from client developers on our purely RESTful strategy. They often want data preassembled from several REST resources and delivered via a single API call instead of doing the assembly themselves. The new EIP will facilitate this requirement.
* Convince developers and show benefits to management
* The ability of the community to ramp up and develop the skill sets necessary to expose services and consume them. We are also having issues with the amount of time it takes for data stewards to approve requests to consume services.
* Unbundling and rebundling complex logic in a new way.
* Everyone wants to consume APIs, nobody wants to contribute.
* The problem with MQ and SOAP was the learning curve for the Divisional developers – they simply didn’t have the time to figure out the details. PHP integration with MQ proved to be a challenge too.
* Prioritization. Funding. Technical debt.
* Early adoption was slow. Skill sets required to be productive are hard to acquire, which in turn increases the time until a staff member can become productive. No centralized documentation or API gateway for all services to be discovered.
* Resourcing, knowledge, disagreement over approach
* Governance around data, security, org management
* Service governance. Getting infrastructure in place.
* Using unproven technology. Changing the mindset of people who might be used to doing things in a certain way. Security, specifically authorization.
* There’s an education component of bringing people up to speed with APIs and how to use them. Some people don’t like change and feel that they have less control when they don’t have a local cache of the data.
* Adoption, documentation, technical ability

15. Would you describe your APIs as microservices?

| Answer | Response |
| --- | --- |
| Yes | 3 |
| No (explain below) | 6 |
| I don't know | 6 |

No, explained:
* Primarily implemented GET functionality, which is by nature pretty chunky. Our Web services provide data between apps but don’t encapsulate business functionality except in limited cases. True microservices architecture would require a complete rearchitecture that accounted for eventual consistency and allowed for states of data not currently allowed.
* We’re starting to adopt the microservice model, but at the moment we have a single “student record” service that returns 18 different entities.
* Full-fledged APIs

16. If you have not yet started to work on SOA / API / web services, are you planning to do that in the future?

| Answer | Response |
| --- | --- |
| Yes | 4 |
| No | 1 |
| I don't know | 1 |

Note: not all survey participants were presented with this question. Only the ones that previously answered no to question 3.

17. Number of calls per minute for most active web service / API
* N/A
* 100
* 20
* < 1
* 1000
* 200
* Too early in the process to tell.
* N/A
* 3k per minute
* ? – in thousands for the hospital
* NA/Don't know
* N/A
* not live yet

18. Number of web services / APIs available
* 5
* 12 each with several different resources
* 20
* 17
* 5-10
* 30 in our API Gateway
* currently 1 service – will be separated into between 10 to 16 micro services
* <10
* 24 APIs
* ~50
* 10
* Less than 12
* less than 15
* not live yet

19. Number of applications using these web services / APIs
* N/A
* 40-60
* 1
* 14
* 10-15
* 300+ Many are student applications
* Two planned for now.
* <10
* 83
* ?
* 10-20
* 5-10
* not live yet

20. Number of departments / organizations using these web services / APIs
* N/A
* 10-20
* 1
* 6
* 3
* Don’t Know
* Two planned for now
* Just internal to IT at this point.
* >25
* ?
* 10-20
* 6-7
* not live yet

21. How much advance notice before API / web service retirement do you provide to your users?
* We anticipate being able to give 1 year notice, but also plan to use API versioning to allow for multiple versions concurrently
* 6 months minimum
* 30 days
* We have allowed the provider to determine that, but our expectation is that it will be at least 18 months.
* NA
* We haven’t retired any services as yet, but we would be expected to provide as much notice as possible because Divisions may not have the resources available to change their consumers.
* I don’t know.
* 4 weeks for production, variable for test environment based on potential effects.
* Not at this level – more adhoc – looking at api manager to support
* Once a web service endpoint is published, it is very difficult to retire.
* N/A
* two years is what we plan to provide

22. What is the granularity of your API versioning?

| Answer | Response |
| --- | --- |
| Single object / resource (e.g.: example.com/api/students/v1/) | 1 |
| Collection of objects / resources (e.g.: example.com/api/v1/students/) | 8 |
| I don't know | 3 |
| Other | 1 |

23. What versioning scheme do your APIs use?

| Answer | Response |
| --- | --- |
| URL: /api/v1 | 10 |
| Query parameter: ?v=1.0 | 1 |
| HTTP header | 3 |
| Other | 3 |
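
To make the three schemes in question 23 concrete, here is a minimal Python sketch of how the same request could look under each one. The host `api.example.edu`, the `students` resource and the vendor media type are illustrative assumptions, not taken from any respondent's API, and no request is actually sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "https://api.example.edu"  # hypothetical host, for illustration only


def versioned_requests(resource: str, version: str) -> dict:
    """Build (but do not send) one request per versioning scheme."""
    return {
        # Version in the URL path, e.g. /api/v1/students
        "url": Request(f"{BASE}/api/v{version}/{resource}"),
        # Version as a query parameter, e.g. /api/students?v=1
        "query": Request(f"{BASE}/api/{resource}?" + urlencode({"v": version})),
        # Version in an HTTP header, here through a made-up vendor media type
        "header": Request(
            f"{BASE}/api/{resource}",
            headers={"Accept": f"application/vnd.example.v{version}+json"},
        ),
    }


requests_by_scheme = versioned_requests("students", "1")
for scheme, req in requests_by_scheme.items():
    print(scheme, req.full_url, dict(req.header_items()))
```

The URL scheme keeps the version visible in every link, which may be part of why it was the most common answer; the header scheme keeps URLs stable at the cost of being harder to try from a browser.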

24. From where do you serve query responses? (multiple choice)

| Answer | Response |
| --- | --- |
| Source database | 14 |
| Intermediate data store / db | 6 |
| Cache | 4 |
| Operational Data Store or Data Warehouse | 6 |
| Other | 2 |

Other: code, LDAP

25. What data formats are used by your SOA / Web Services / API layer?

| Answer | Response |
| --- | --- |
| JSON | 14 |
| XML | 10 |
| CSV | 1 |
| Other | 1 |

Other: xhtml

26. Which one of these hypermedia formats / types do you use?

| Answer | Response |
| --- | --- |
| Siren | 0 |
| JSON-LD | 2 |
| HAL | 0 |
| Other | 2 |

27. How were the data models in your SOA / APIs (representation of data objects e.g.: course, student, event) defined?

| Answer | Response |
| --- | --- |
| Single department | 5 |
| Group of departments | 6 |
| Data governance | 5 |
| Other | 3 |

Other:
* collaborative REST design sessions with as many stakeholders involved as possible
* We have to use existing system of record data models.
* Still need to define those

28. Are your data models a direct representation of your database tables / schema?

| Answer | Response |
| --- | --- |
| Yes | 5 |
| No | 9 |
| I don't know | 0 |

29. Do you have a data governance initiative?

| Answer | Response |
| --- | --- |
| Yes | 10 |
| No | 4 |
| I don't know | 0 |

32. Do you use any tools to automatically convert a db schema / db tables to web service, API or microservice?

| Answer | Response |
| --- | --- |
| Yes | 2 |
| No | 12 |
| I don't know | 0 |

34. Does your higher ed institution have a development portal to onboard new developers, including: list of APIs, web services or misc. resources for developers?

| Answer | Response |
| --- | --- |
| Yes | 5 |
| No | 16 |
| I don't know | 2 |

36. What type of authentication do you use (e.g., LDAP, SAML, social login, etc.) with your development portal?

| Answer | Response |
| --- | --- |
| LDAP | 3 |
| SAML | 3 |
| Social login | 0 |
| Other | 3 |

Other:
* CAS
* CAS
* Basic Auth

37. Is the development portal open to any of the following? (multiple choice)

| Answer | Response |
| --- | --- |
| Staff | 5 |
| Students | |
| Faculty | 4 |
| 3rd party developers | 1 |

38. What software / technology do you use for the development portal?
* WSO2 API Gateway
* Jive
* WSO2
* WSO2 currently also looking at other solutions

39. Do you use an API Gateway / Management Layer?

| Answer | Response |
| --- | --- |
| Yes | 8 |
| No | 5 |
| I don't know | 0 |

46. What types of authentication are required by your higher ed institution to make API / web services calls?
* Public ones: none but will move to access tokens. Private ones: OAuth (still in development)
* UW administered tokens, X509 certs
* ADFS, CAS, Basic Authentication
* set of credentials similar to NetID/password
* Basic Auth, OAuth, CAS, other
* Oauth2.0, WS-Security
* OAuth 2.0 Client Credentials flow – because the users don’t own the data, the University Registrar does. We will consider other flows when the user is actually a Resource Owner.
* API keys, BasicAuth
* WS-Security username token, sometimes client certificates
* jwt
* Username/password, firewall rules
* Oauth
* tokens
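
Several responses above mention OAuth 2.0, and one names the Client Credentials flow specifically. As a rough illustration of that flow, the sketch below builds the token request an application would make for itself, with no end user involved. The token URL, client ID and secret are hypothetical placeholders, and the request is only constructed, never sent.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def client_credentials_request(token_url, client_id, client_secret, scope=None):
    """Build (but do not send) an OAuth 2.0 client-credentials token request."""
    form = {"grant_type": "client_credentials"}
    if scope:
        form["scope"] = scope
    # The application authenticates as itself using HTTP Basic credentials.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return Request(
        token_url,
        data=urlencode(form).encode(),
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


# Hypothetical endpoint and credentials, for illustration only.
token_request = client_credentials_request(
    "https://api.example.edu/oauth2/token", "my-app", "s3cret"
)
print(token_request.get_method(), token_request.full_url, token_request.data)
```

A successful response would normally carry an access token that the application then presents on each API call, which matches the "specific to the application" pattern reported in question 48.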

47. What types of authentication have you used in the past, that you phased out?
* client certs for very limited SOAP calls
* NA
* basic auth
* none
* n/a
* n/a
* N/A
* ldap
* none

48. Are the authentication tokens, api keys or other authentication methods specific to the application making the request or the user of the application?

| Answer | Response |
| --- | --- |
| Specific to the application | 9 |
| Specific to the user of the application | 3 |
| I don't know | 0 |
| Other | 1 |

49. What types of security policies do you have in place for making API / web services calls?
* None yet, but planning to move APIs behind the API gateway, use access tokens for all calls, and enforce throttling on public APIs and OAuth for private APIs
* use a home-grown permissions system ASTRA to manage what resources an application can access; applications are assigned roles just like people
* Applications must obtain claims before apps are authorized to use APIs
* In addition to application credentials, the ESB checks with the services registry if the application has approval to use the service. We check the IP addresses of external consumers.
* Depends, some are open, some are highly secured.
* The application making the calls must be assigned a UTORid (the primary identity credential used at UofT). This credential must be used to obtain an OAuth access token, which is then included with API call. There is a policy enforcement point that validates the access tokens. The user’s UTORid is included in the request so that the API container (WebSphere Application Server) can perform authorization.
* IP restricted when possible. Registered API key required.
* AuthNZ at the ESB level to determine who the requester is and whether or not they can call the method. At the functional web service layer, a lot of service providers will ask the question what data is returned in the request.
* Under development
* We are in the early stages of defining policies for next generation of APIs.
* It depends on the application
* none yet

50. Do you cache your API responses?

| Answer | Response |
| --- | --- |
| Yes | 8 |
| No | 4 |
| I don't know | 1 |

52. How do you handle communication with developers who rely on your APIs, web services regarding upcoming features?

| Answer | Response |
| --- | --- |
| Email | 11 |
| Blog | 3 |
| Forums | 2 |
| Other | 3 |

Other:
* Yammer
* Personal interaction
* wiki pages

53. What support mechanisms are used by your higher ed institution to provide support of your web services?

| Answer | Response |
| --- | --- |
| Ticket system | 13 |
| Forums | 0 |
| Other | 2 |

Other:
* UserVoice
* Personal interaction

54. What has been your strategy for moving away from bulk data feeds?
* This is a struggle and one we hope to entice people away from with this new API service. That said, private data exchanges (e.g. between our ERP and our meta directory) likely won’t go through the API gateway, but will (we hope) still migrate to an ESB model and away from nightly data dumps.
* still evolving; in fact our new EIP will support bulk feeds as a standard interface for those clients that need/want them. What won’t be supported is direct connection to underlying schema
* We are investigating SOA in a small POC and from there hope to build consensus around moving forward with a larger deployment, institute a governance process, etc.
* Slowly moving away, provide education and training
* We have not settled on a single strategy. Right now, it is mostly in the form of encouragement. At this time, no pre-existing bulk data feeds have been converted. We have architected services into new projects where we would have used bulk data feeds in the past.
* None currently. We have had individual efforts where such a change was suggested or recommended, but the only data transmission method currently used is file transfers.
* Developing ESB, SOA, APIs is the goal but we haven’t started yet. Hope to start a small pilot this coming year.
* Demands that are more real-time in nature have naturally moved us away, but we do still use them sometimes – in fact, we sometimes use API calls to produce the data feeds.
* Many bulk data feeds are used to replicate data into local databases. Local databases are preferred because they are thought to be the only way to provide reliable and timely data to local applications. We are trying to change this pattern by providing APIs.
* Initial approach was to replace batch downloads with real-time transactional messages. We found that the benefit was minimal until the academic or administrative process was changed to accommodate real-time transactions, then the benefit was substantial. However, very few Divisions are ready to change processes even if the benefits were obvious. It needs time for the administrators to think in real-time transactions rather than daily/weekly batch downloads.
* As opportunities arise in existing projects whose constraints allow it, vs. an initiative in and of itself.
* We have been suggesting that our high volume bulk curricular data applications use a new service we have for delivering roster changes to JMS queues in an asynchronous manner. We are also delivering HR data to our UW System customers via our centralized HR system to UW System HR systems for local provisioning. We still have some large pull customers that use SOAP services to refresh their local databases, but work is underway to enhance what we can deliver asynchronously.
* Replace with services
* Still in the early stages
* Show the value of real time data and/or events
* Start by tackling only new work and integrations. Add desirable features to APIs.

56. How was the communication to the developers handled throughout the migration away from bulk data feeds?
* We’ll see.
* hasn’t happened yet
* We’ve not gotten that far yet.
* Provide education and training
* N/a
* Through architecture engagements with individual projects.
* The problem was we started with the developers – we should have started with the senior administrators by describing the benefits to their processes and increase in data currency…and not mention technology at all!
* Ad hoc, project by project.
* Email and meetings
* This is still work in progress.
* N/A
* Since API initiative is very new, we are hoping to have results available in next 3-4 years
* not applicable yet.

57. What are the top 3 SOA / API problems you’re trying to solve?
* – Get an operational gateway in service – Retire SOAP and associated legacy support applications – Move to real-time exchange of information between systems
* Top 7: integration with metadata management strategy and tools; granular data element level security; increased velocity building new APIs; better managed application management; cross-domain APIs; highly-performant search; produce more events for client apps
* 1. How to do it (Authentication, stumbling blocks, good default design patterns to champion). 2. How to illustrate the value of this effort to upper management to allocate funds. 3. How to navigate the political waters
* Easier access management; Reduce development complexity; Fit for cloud and mobile first strategy
* (1) Adoption (2) Slowness in data steward approvals (3) Strengthening security
* Getting interest/buy-in from the web/portal team; training/obtaining staff with the skills to facilitate this move (analysts and developers with actual API and SOA experience); lack of understanding/prioritization from IT executives around the importance of this architectural change.
* 3) simplification/feed elimination 2) timely data 1) flexibility for apps mash-ups, mobile, etc.
* Improve speed of innovation; migrate away from legacy systems; improve user experience
* Getting enough re-usable content in our API Directory; providing training to developers; changing the culture to make exposing data to our community of developers an important deliverable for projects
* Secure real-time access to student data in the System of Record, rather than stale local copies; near real-time synchronization of identity management repositories
* Exposure of data to innovators around campus. More modern, real-time, sustainable integrations. Overall efficiency, maximizing reuse instead of duplicated bulk jobs.
* API management; continuous integration within our SOA environments; OAuth or Identity Server provided credentials for client-side JavaScript that would access our servers
* Improve integration with campus and 3rd party systems; more secure access to data by campuses and 3rd party systems; promote re-use of business functions; promote innovative uses of
* Authorization; canonical data model; API governance
* RFP for API Gateway; automation; developing APIs that make business sense

58. What’s your higher ed institution’s 5-year plan for SOA / API / Web Services?
* Get a production service rolled out that is actively used by the university community. Hopefully secure budget to manage it one day.
* Build an Enterprise Integration Platform to replace ODS and to feed EDW and migrate existing Web Services to be hosted from EIP; Govern EIP holistically as part of larger data management initiative
* To say we have a 5-year plan for Web Services is to give us more credit for planning than I believe we deserve at the moment. Right now we’re trying to build momentum and doing so in a guerilla fashion as part of smaller projects in the hopes we can scale this effort in the future.
* Moving away from batch processes to ESB and service automation
* More adoption
* Last time I spoke with the teams about it, no plan existed other than ad-hoc development as required by selected projects. And those projects are generally rejected due to lack of supportability.
* ESB, RESTful APIs
* Not there yet.
* Don’t Know
* We’re waiting to see what our experience is with our current approach (REST/OAuth 2.0) and if successful, focus on API management (we’ll probably buy a product).
* I don’t know.
* Build an Integration Center of Excellence for our campus.
* Still in planning and proof-of-concept phase. Evaluating cloud API management solutions.
* Have around 100+ apis
* Start by centrally providing APIs. Develop a foundation (docs, testing, infrastructure) that others can leverage. Get other departments on campus to develop APIs using best practices provided centrally. In 3 years, we’d like APIs to be the main method of transferring data between departments.

59. What are your areas and topics of interest for SOA?
* API Gateway deployment, currently
* security metadata
* We would like to see it in action at a larger scale and the effect it has had on productivity. We would be interested in discussing architecture and infrastructure decisions to learn from those. At this time we are primarily leaning toward an ESB-based platform offering web services, primarily RESTful, as our approach.
* ERP system integration and mobile apps
* (1) OAuth 2 (2) Adoption strategies (3) API Management Tools (4) Approval workflows and automation of approval workflows (5) Strategies for working effectively with data stewards (6) Documentation tools
* Real time data exchange. Improved accuracy of data. Improved security architecture.
* API sharing for common things that we do in higher ed.
* Is there an opportunity to define a common API for all universities to implement? http://edutechnica.com/2015/06/09/flipping-the-model-the-campus-api/ (I’m george.kroner@umuc.edu)
* Governance, governance, governance.
* ESB to iPaaS change in the higher ed market. Empowering innovative developers with data. Establishment of shared APIs as a path to more sane, governed, transparent, sustainable integration landscape in IT.
* REST, APIs, Authentication, Authorization, Organizational
* Already listed on ITANA site
* developing APIs speeding up the process through automation

Posted in API.

Varnish is a robust caching web service used by many high-profile, high-traffic websites. Acquia uses Varnish to help end users retrieve web sites faster and to help keep the load down on your servers. Once a page is in the cache it is served quickly, but what if you need to make a quick content adjustment? The cached copy is then out of date and needs to be cleared, so how do you keep your cache fresh?

What we did for our Drupal site was set up Cache Expiration together with the Purge module to connect to our Varnish nodes and keep all our pages up to date. Most modules, like Varnish and Purge, integrate some page clearing, but they don’t clear Views or Panels caches. Cache Expiration offers more configurable options to act on, and we use the Workflow module to update our Views. With this extra control we were comfortable extending our cache lifetime to 7 days, which is beyond what Drupal lets you select from the pull-down. With a quick one-line config change in settings.php we can set the cache lifetime:

$conf['page_cache_maximum_age'] = 86400 * 7; // 7 days, in seconds
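
As a quick check of the arithmetic: `page_cache_maximum_age` is expressed in seconds, one day is 86,400 seconds, and so seven days comes to 604,800. The short Python sketch below is purely illustrative and separate from the Drupal configuration; the helper name is made up.

```python
from datetime import timedelta


def cache_lifetime_seconds(days: int) -> int:
    """Convert a cache lifetime in days to the seconds value a max-age setting expects."""
    return int(timedelta(days=days).total_seconds())


seven_days = cache_lifetime_seconds(7)
print(seven_days)  # 604800
```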

Now that our pages set the cache header to 7 days, we need to set up one of the many purge options that Cache Expiration offers. After installing Cache Expiration, go to admin/config/system/expire and select External Expiration, since we want to connect to our external cache server. If you are going to use Acquia Purge or Varnish, uncheck “Include Base URL in Expires”. The only sections I really needed to worry about were Node Expiration and Menu Links Expiration, as we don’t use Comments and our user pages are not Drupal user account pages. For files, we use Apache to send per-file-type headers that tell the cache server how long it can hold onto them, and all file uploads have _[0-9] appended to the file name. For Node Expiration we set all three actions (Insert, Update, Delete) to trigger a page cache purge. Depending on your setup you might be fine with just the basic Front Page and Node Page being purged, but I wanted to do something a little different and chose Custom with these two URLs:

http://[site:url-brief]/node/[node:nid]
http://[node:url:brief]

The next section we set up was Menu Links Expiration. Again, all actions are checked. The right Menu Depth really depends on your menu structure and on which menus link to content that is updated frequently; for us, Main Menu with a depth of 1 was all I needed.

Definition of what Menu Depth is from the maintainer:

The goal is to easily arrange for a high visibility menu to be consistent and current on all the pages the menu links to. So, if you are using a menu block with a depth of 2, you can configure this plugin to clear the URLs linked to by said menu block.

Now that all our nodes and menus are set up to purge when content is changed, it’s time to move on to our Views. We set up one rule per View, so for Feature Story we have one rule that purges the Home page and our All Stories page.

Create a new Rule and use two events as the trigger: After Saving New Content and After Updating Existing Content. We filtered these events by content type so that the rule only fires for that specific piece of content; for this example I will use Feature Story as the content type. We didn’t need any conditions, so that was left as None. Finally, the part that matters: the Actions. Add a new action, Clear URL(s) from the Page Cache. As soon as you select this action the page will update and present you with a text box to enter the URL(s) you wish to purge. To continue with the Feature Story example, our site only had one URL that needed to be purged:

http://[site:url-brief]/feature-story

Save and that’s it. Any time someone updates a Feature Story on your site, the Node Page and the View’s page will now be wiped from the Cache server and the next request to those pages will now cache the newest content.
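
To make the purge step more concrete: purge integrations for Varnish typically ask a cache node to drop a page by sending an HTTP request with the PURGE method for that page's URL, which only works if the Varnish configuration has been set up to accept such requests. The Python sketch below builds such a request without sending it. The cache node name, site host name and helper function are assumptions for illustration and are not taken from the modules above.

```python
from urllib.request import Request


def build_purge_request(cache_node: str, site_host: str, path: str) -> Request:
    """Build (but do not send) a PURGE request for one page on one cache node."""
    return Request(
        f"http://{cache_node}{path}",
        # The Host header tells the cache which site's copy of the page to drop.
        headers={"Host": site_host},
        method="PURGE",
    )


# Hypothetical cache node and site, for illustration only.
purge = build_purge_request("varnish1.example.internal", "www.example.edu", "/feature-story")
print(purge.get_method(), purge.full_url, purge.get_header("Host"))
```

With several Varnish nodes, one such request would be needed per node for every URL the rule lists.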

For the home page I have a special rule that purges it from the cache any time anything is updated on the site; you might want to modify this to suit your needs.


 

Modules:

Cache Expiration: https://www.drupal.org/project/expire

Purge: https://www.drupal.org/project/purge

Workflow: https://www.drupal.org/project/workflow

Varnish: https://www.drupal.org/project/varnish

Acquia Purge: https://www.drupal.org/project/acquia_purge

This article summarizes the current security solutions for Docker containers. The solutions in this blog post have been discussed and designed by the Docker community. You can also find valuable tips on how to enhance security while running Docker in a production environment.

Possible Security Issues in a container-based environment

Before we jump into the security solutions, let’s explore some security issues of container-based systems. Generally speaking, there are three types of attacks that exploit the vulnerabilities of container-based systems.

Types of Attacks:

  • Container compromise: results in illegitimate data access and affects the control flow of instructions
  • DoS (Denial of Service): disturbs normal operation of the host or of other containers
  • Privilege escalation: obtains a privilege that was not originally granted to the container

Disclosed Vulnerabilities:

  • Namespacing Issues – Docker containers utilize kernel namespaces to provide a certain level of isolation. However, not all resources are namespaced:
    • UID: Causing “root” user vulnerability
    • Kernel keyring: containers running with a user of the same UID will have access to the same keys if they are handled by the kernel keyring
    • Kernel & its modules: Loaded modules become available across all containers and the host
    • Devices: includes disk drives, sound cards, GPUs, etc.
    • System time: The SYS_TIME capability is not granted by default, but if it is enabled, we will need to worry about it.
  • Kernel Exploit – Container-based applications share the same host kernel, so flaws in the host kernel might allow malicious containers to escape and gain access to the whole system.
  • DoS Attacks – Since all containers share kernel resources, if a container or user consumes too much capacity of a certain resource, it will starve out other containers on the host.
  • Container Breakout – Because users are not namespaced, any process that breaks out of the container will have the same privileges on the host as it did in the container. For example, if you were root in the container, you will be root on the host. It’s a typical privilege escalation attack: unlikely to happen, but possible.
  • Poisoned Images – It’s possible for attackers to modify images or embed malicious programs in them, and trick users into downloading such corrupt images.
  • Compromising secrets – Applications need credentials to access databases or backend services. An attacker who can get access to these credentials will also have the same access as the application. This problem becomes more acute in a microservice architecture in which containers are constantly stopping and starting.

Current Solutions:

Now let’s take a look at the security solutions that come with the current Docker implementation, and at the strategies and techniques that can be used in production.

Least Privileges

One of the most important principles to achieve container security is Least Privileges: each process and container should run with the minimum set of access rights and resources it needs to perform its function. This includes the actions to reduce the capabilities of containers:

–  Do not run processes in a container as root, so that an attacker who compromises a process does not gain root access.

–  Run filesystems as read-only so that attackers cannot overwrite data or save malicious scripts to file.

–  Cut down the kernel calls that a container can make to reduce the potential attack surface.

–  Limit the resources that a container can use.

This Least Privileges approach reduces the possibility that an attacker can access or exploit data or resources via a compromised container.
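
As a rough sketch of how these four points can map onto `docker run` options, the Python helper below assembles a command line without executing it. The image name, user ID and limits are placeholders, and the helper itself is hypothetical; the right values depend on the application. Note that `--cap-drop ALL` removes Linux capabilities, while restricting individual system calls is what the Seccomp section further down covers.

```python
import shlex


def hardened_run_command(image, user="1000:1000", memory="256m", cpus="0.5", pids_limit=100):
    """Assemble (but do not run) a `docker run` command with reduced privileges."""
    return [
        "docker", "run", "--rm",
        "--user", user,                         # run as a non-root UID:GID
        "--read-only",                          # mount the root filesystem read-only
        "--cap-drop", "ALL",                    # drop all Linux capabilities
        "--security-opt", "no-new-privileges",  # block privilege gain via setuid binaries
        "--memory", memory,                     # cap memory usage
        "--cpus", cpus,                         # cap CPU usage
        "--pids-limit", str(pids_limit),        # cap the number of processes
        image,
    ]


# "example/app:1.0" is a placeholder image name.
command = hardened_run_command("example/app:1.0")
print(shlex.join(command))
```

An application that needs to write temporary files or bind to a privileged port would need some of these options relaxed, for example by adding back a single capability rather than all of them.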

Internal Security Solutions

Containers can leverage Linux namespaces and control groups to provide a certain level of isolation and resource limitation.

Namespace

Docker provides process, filesystem, device, IPC and network isolation by using the corresponding namespaces.

  • Process Isolation: Docker uses the PID namespace to separate container processes from the host as well as from other containers, so that processes in a container can’t observe or do anything to processes running on the host or in other containers.
  • Filesystem Isolation: Docker uses the mount namespace to ensure that mounts made inside a container only have an impact inside that container.
  • Device Isolation: A container cannot access any devices unless it is privileged.
  • IPC Isolation: Docker uses the IPC namespace to prevent the processes in a container from interfering with those in other containers.
  • Network Isolation: Docker uses the network namespace so that each container has its own IP address, IP routing tables, network devices, etc.

Control Group

Docker employs cgroups to control the amount of resources, such as CPU, memory, and disk I/O, that a container can use. Under this control, each container is guaranteed a fair share of the resources but is prevented from consuming all of the available resources.

Linux Kernel Security Systems

Kernel security systems exist to harden the security of a Linux host system. We can also use them to secure the host from containers.

By default, Docker drops a large number of Linux capabilities from its containers in order to prevent an attacker from damaging the host system when a container is compromised. It also allows you to configure which capabilities a container can use.

Linux Security Module (LSM)

The two most popular LSMs are AppArmor and SELinux:

  • SELinux is a labeling system that implements Mandatory Access Control using labels. Every object, such as a process, file/directory, network port or device, has a label. Rules are put in place to control access to objects.
  • AppArmor is a security enhancement for Linux based on Mandatory Access Control, like SELinux. It permits the administrator to load a security profile for each program, which limits the capabilities of the program.

Another Approach

Seccomp

The Linux seccomp (secure computing mode) facility can be used to restrict the system calls that can be made by a process; that is, containers can be locked down to a specified set of system calls.
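
As an illustration, Docker accepts seccomp profiles as JSON files passed with `--security-opt seccomp=<file>`. The Python sketch below generates a small allowlist-style profile: every system call is refused by default and only the named ones are allowed. The helper name and the four system calls are examples only; a real application needs a much longer list, so treat this as a shape to start from rather than a working profile.

```python
import json


def allowlist_profile(allowed_syscalls):
    """Return a seccomp profile that denies everything except the given syscalls."""
    return {
        # Any system call not listed below fails with an error.
        "defaultAction": "SCMP_ACT_ERRNO",
        "syscalls": [
            {
                "names": sorted(set(allowed_syscalls)),
                "action": "SCMP_ACT_ALLOW",
            }
        ],
    }


# Far too short for a real program; shown only to illustrate the structure.
profile = allowlist_profile(["read", "write", "exit_group", "futex"])
profile_json = json.dumps(profile, indent=2)
print(profile_json)
```

In practice it is usually easier to start from Docker's default profile and remove calls than to build an allowlist from nothing.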

In Production

When running Docker in a production environment, you will want to leverage the security solutions listed above and apply proper precautions to provide a more secure and robust system. There are three major security tips to keep in mind when running Docker in production.

Segregate Containers by Host

The main reason to place each user on a separate Docker host is to minimize the damage when a container breakout happens. If multiple users share one host and one user monopolizes all the memory on the host, the other users are starved. Even worse, if a container breakout happens, a user could gain access to another user’s containers or data through the compromised container.

Therefore, although this approach is less efficient than sharing hosts between users and will result in a higher number of VMs and/or machines than reusing hosts, it’s important for security.

A similar measure, for the same reason, is to separate containers that hold sensitive information from less sensitive ones.

Applying Updates

Just as is recommended for Windows systems, it’s recommended to apply updates regularly. This includes updating base images and dependent images to fix vulnerabilities in common utilities and frameworks. At times, we need to update the Docker daemon to gain access to new features, security patches or bug fixes. Removing unsupported drivers is also important, because they can be a security risk: they won’t receive the same attention and updates as other parts of Docker.

Image Provenance

To safely use images, you need to have guarantees about their provenance:

  • where they came from
  • who created them
  • whether you are getting exactly the image you want

There are three solutions for image provenance: secure hashes, a secure signing and verification infrastructure, and proper use of Dockerfiles.

  • Secure Hash: A secure hash is like a fingerprint for data. It’s a small string that is, for practical purposes, unique to the given data. If you have a secure hash for some data and the data itself, you can recalculate the hash of the data and compare the two. In Docker, this is called a digest: a SHA-256 hash of a filesystem layer or manifest (a metadata file describing the parts of an image, containing a list of constituent layers identified by digest).
  • Secure Signing and Verification Infrastructure: Data can be altered or copied if it travels over insecure channels (e.g. HTTP), so we need to ensure we publish and access content using secure protocols. Notary is an ongoing secure signing and verification infrastructure project from Docker; it compares a checksum for a downloaded file with the checksum in Notary’s trusted collection for the file source (e.g. docker.com). For more details, see https://github.com/docker/notary
  • Dockerfile: Contrary to what you might expect, the same Dockerfile is likely to produce different images over time, so it becomes hard to be sure what is in your images. To use Dockerfiles properly, you would:
    • Always specify a tag in the FROM instruction, and use a digest to pull exactly the same image each time
    • Provide version numbers when installing software from package managers. However, since package dependencies can change over time, sometimes we need to use tools (e.g. aptly) to take a snapshot of the repository
    • Verify any software or data downloaded from the internet by using checksums or cryptographic signatures.
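
The recalculate-and-compare check behind secure hashes and checksums can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Docker implements it: the function names `sha256_digest` and `verify_digest` are hypothetical, and the `sha256:<hex>` string format simply mirrors how Docker displays digests.

```python
import hashlib
import hmac


def sha256_digest(data: bytes) -> str:
    """Return a digest string in the 'sha256:<hex>' style that Docker displays."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def verify_digest(data: bytes, expected: str) -> bool:
    """Recalculate the hash of the data and compare it with the expected digest."""
    # compare_digest performs a constant-time comparison of the two strings
    return hmac.compare_digest(sha256_digest(data), expected)


if __name__ == "__main__":
    original = b"contents of a downloaded layer or file"
    # Stands in for the digest a publisher would list next to the download.
    published = sha256_digest(original)

    print(verify_digest(original, published))         # unchanged data
    print(verify_digest(original + b"!", published))  # tampered data
```

Any change to the data, even a single byte, produces a different digest, which is why a published digest is enough to detect tampering in transit.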

 

This blog post is a quick look at the current security solutions for Docker containers. If you are interested, please see the reference articles for more details. Are you using Docker in production? Have you implemented some of these security models?

References

[1] Analysis of Docker Security

[2] Docker Security – Using Containers Safely in Production

[3] Docker Doc – Docker Security

Web Application Programming Interfaces (APIs) allow us to share data between systems without leaking low-level details that would otherwise cause tight coupling between systems. These APIs are just like any application, with the small difference that they don’t have an end-user GUI. Instead, APIs focus on gathering data from backend(s) and performing operations on this data while providing a standard and consistent interface to these operations. APIs have the same needs as regular applications when it comes to iterative planning/design and user feedback. This is where Swagger (OpenAPI Specification) comes in.

Work on the Swagger design language started in 2010 as a framework to document and describe Web APIs. On January 1, 2016, it was renamed the OpenAPI Specification. The rename was part of turning the Swagger project into one of the Linux Foundation Collaborative Projects, which gives vendors and the community more involvement in the direction of the toolset and the design language.

The OpenAPI Specification allows us to use JSON or YAML to describe our web API endpoints (URLs), their parameters, response bodies and error codes. Before the OpenAPI Specification existed, people would use text files, Word documents or other formats poorly suited to web APIs to document their APIs.
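
To make this concrete, here is a minimal sketch of a version 2.0 description, built as a Python dictionary and printed as JSON. The API shown is hypothetical: the title, the `/courses` path and the `term` parameter are invented for illustration and do not describe any real OSU API.

```python
import json

# Hypothetical API: the title, path and parameter names are for illustration only.
spec = {
    "swagger": "2.0",
    "info": {"title": "Example Courses API", "version": "0.1.0"},
    "paths": {
        "/courses": {
            "get": {
                "summary": "List courses for a term",
                "produces": ["application/json"],
                "parameters": [
                    {
                        "name": "term",
                        "in": "query",
                        "required": True,
                        "type": "string",
                        "description": "Term code to search",
                    }
                ],
                # Response body descriptions and error codes live here.
                "responses": {
                    "200": {"description": "Matching courses"},
                    "400": {"description": "Missing or invalid term"},
                },
            }
        }
    },
}

if __name__ == "__main__":
    # The same structure could be written by hand as a JSON or YAML file.
    print(json.dumps(spec, indent=2))
```

The endpoint, its parameters and its response codes all sit in one machine-readable structure, which is what lets editors and generators work from the same file.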

When OSU began our API development efforts, we wanted to have a communication and feedback cycle with OSU developers. Using the OpenAPI Specification (Swagger), we can use a tool such as the Swagger editor (http://editor.swagger.io/#/) and make changes to the documentation of the API in real time while we talk to developers on campus. This allows us to make changes to the visible documentation of an API without having to implement it or spend a lot of time developing a separate structure or document. We can make a change directly to the YAML file, which is faster than having to adjust already implemented APIs.

Information Services researched a variety of tools to describe APIs. We looked at OpenAPI Specification, RAML (http://raml.org/), API Blueprint (https://apiblueprint.org/) and I/O Docs (https://github.com/mashery/iodocs). At first, from a technical perspective, RAML was the most attractive design language when we compared it to OpenAPI Specification, but version 2 of the OpenAPI Specification addressed the downsides of version 1. OpenAPI Specification also had the largest user base, with a huge community of developers online and, along with that, vendor support and open source tools and frameworks.

The benefits of OpenAPI Specification are:

  • Online editor – provides a WYSIWYG view of the API. Easy to make changes and see the output.
    [Image: Swagger editor]
  • Mock server – you can describe your API and have a mock/test server endpoint that returns test data. This is helpful when testing APIs.
    [Image: server generation screenshot]
  • Client code – sample code, in a variety of languages, that can be used to test and use the APIs.
    [Image: client code generation]
  • Vendor/OSS support – a variety of open source tools, frameworks and vendor offerings work with OpenAPI Specification, which has made it the de facto language for documenting APIs.
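
As a rough illustration of the mock server idea, the sketch below looks up the example response documented for an operation, which is the kind of test data a mock endpoint could return. It is not the code the Swagger tools generate: the `SPEC` fragment, the `/hello` path and the `mock_response` helper are hypothetical, and only the `examples` field (keyed by MIME type) comes from version 2.0 of the specification.

```python
from typing import Any, Optional

# Hypothetical spec fragment with an example response (the 2.0 "examples" field).
SPEC = {
    "swagger": "2.0",
    "info": {"title": "Example API", "version": "0.1.0"},
    "paths": {
        "/hello": {
            "get": {
                "responses": {
                    "200": {
                        "description": "A greeting",
                        "examples": {"application/json": {"message": "hello"}},
                    }
                }
            }
        }
    },
}


def mock_response(spec: dict, path: str, method: str = "get",
                  status: str = "200",
                  mime: str = "application/json") -> Optional[Any]:
    """Return the example body documented for an operation, or None if absent."""
    operation = spec.get("paths", {}).get(path, {}).get(method.lower(), {})
    response = operation.get("responses", {}).get(status, {})
    return response.get("examples", {}).get(mime)


if __name__ == "__main__":
    print(mock_response(SPEC, "/hello"))    # the documented example body
    print(mock_response(SPEC, "/missing"))  # None: nothing documented for this path
```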

Our API development cycle is:

  1. Talk to stakeholders and data owners.
  2. Design API (using OpenAPI Specification).
  3. Collect feedback.
  4. Implement.
  5. Release as Beta & collect feedback.
  6. Release to Production.
  7. Go back to first step.

These steps are similar to the application development cycle. The key component of our API development is listening to our community of developers. The APIs are built for developers, and using the OpenAPI Specification to design each API with developers in mind allows us to collect feedback early on. Before we start implementing an API, we have a really good idea of what the developers need, and the design has been validated by API consumers (OSU developers), stakeholders and data owners.

Our API source code is hosted on GitHub, and the OpenAPI Specification file is treated just like code: it is included along with the source code. The API gateway that we use (apigee.com) allows us to upload our OpenAPI Specification YAML file, and it creates the documentation pages needed for our APIs. This process streamlines our documentation while also preventing us from locking ourselves in to a single vendor. If, down the road, we need to move to another API gateway solution, we will be able to reuse our OpenAPI Specification YAML files to create and document our APIs.
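
Treating the specification like code also means it can be checked automatically. The sketch below is a minimal, hypothetical check of the fields that version 2.0 of the specification requires (`swagger`, `info` with `title` and `version`, and `paths`). The helper name `missing_required_fields` is an assumption for illustration, not part of our tooling, and a real project would more likely use a full validator.

```python
def missing_required_fields(spec: dict) -> list:
    """List the required 2.0 fields that are absent from a parsed spec."""
    # Top-level fields the 2.0 specification marks as required.
    missing = [key for key in ("swagger", "info", "paths") if key not in spec]
    # The info object has its own required fields; check them only if info exists.
    if "info" in spec:
        info = spec["info"]
        missing += ["info." + key for key in ("title", "version") if key not in info]
    return missing


if __name__ == "__main__":
    draft = {"info": {"title": "Example API"}}  # hypothetical incomplete draft
    problems = missing_required_fields(draft)
    if problems:
        print("Spec is missing: " + ", ".join(problems))
```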

The OpenAPI Specification has been quick for our team to learn. Our students are able to pick it up after a few hours. Starting from a sample file, it is easy to modify it to document new APIs. Once a person has experience with the OpenAPI Specification, we can have a design document ready to share with developers for feedback in less than 30 minutes. This enables us to develop APIs faster and keep our developers happy. Faster development and happy developers? That’s a win.

Posted in API.

Why should you spend your time reading another blog?  Well, if you’re reading this post you’re probably part of, or want to be part of, the OSU developer community. So are we. And we want to grow  our community. To that end, this blog is dedicated to providing useful information to developers on campus.

Developers at OSU come in all sizes and shapes: backend, frontend, Drupal, WordPress, framework and people wanting to learn. They’re working with a variety of languages, frameworks and tools. Given the variety of developers, we plan to provide fresh blog posts at least once a week, and subjects will vary depending on the developer author.

These posts aren’t meant to exist in the ether – we hope they’ll spark conversations and help grow our skills while growing our community. In other words, comment and discuss (but keep it civil). And if there’s anything particular you’d like to know about or a post you’d like to write, let us know in the comments below.