Commit 1456bc1c authored by mhellka

Added a way to request additional information from a search backend.

# Conflicts:
#	apidoc/src/api.yaml
#	cdstar-plugins-pom/cdstar-proxy-search/src/test/java/de/gwdg/cdstar/ext/proxysearch/ProxySearchProviderTest.java
#	cdstar-rest/src/main/java/de/gwdg/cdstar/rest/v3/async/SearchHandler.java
#	cdstar-web-common/src/main/java/de/gwdg/cdstar/web/common/model/SearchHits.java
parent 3d3ae40c
Pipeline #291590 passed with stages
in 6 minutes and 34 seconds
......@@ -97,6 +97,12 @@ endpoints:
This is useful if the realm of the user does not return all groups the user belongs to,
and some search hits are not visible because of that. Each claim is checked against the
realm, and if successful, hits visible to that group are included in the result.
fields:
type: list
help: |
Request additional information from the search backend to be returned with the result set or with each hit.
Valid values depend on the search backend, but most backends should at least support requesting
index field values by name.
results:
- code: 200
......@@ -495,7 +501,6 @@ endpoints:
See <<resumeFile>> for details. Conflicting operations, for example reading the
file content or fetching its info, will fail until the file was completely updated
or removed. `HEAD` requests to the files URL are allowed, though.
consumes:
- '+*/*+'
produces:
......@@ -915,7 +920,10 @@ data:
help: Full file name (including path) of the matched file. Only present if `type` equals `file`.
score:
type: float
help: Relevance score. May be 0 for queries or search backends that do not support relevance scoring.
help: Relevance score. Higher values indicate a more relevant hit.
fields:
type: object
help: Additional information for this hit, usually triggered by a query which requested additional fields.
example: &SearchHitExample
id: ab587f42c2570a884
type: file
......@@ -930,7 +938,7 @@ data:
help: Number of results in this page.
total:
type: int
help: Total number of results in this result set (approximation)
help: Total number of results in this result set (approximation), or -1 if unknown.
scroll:
help: |
A stateless cursor representing the last hit of this result page.
......@@ -938,6 +946,9 @@ data:
hits:
type: list(<<SearchHit>>)
help: List of search hits
fields:
type: object
help: Additional information for this result set.
example:
count: 1
total: 1
......
......@@ -11,12 +11,12 @@ To simplify gateway development and improve security, client credentials are NOT
[source,yaml]
----
plugin:
search:
class: ProxySearchPlugin
target: "https://gateway.example.com/search"
maxconn: 4
header:
X-Custom-Header: value
search:
class: ProxySearchPlugin
target: "https://gateway.example.com/search"
maxconn: 4
header:
X-Custom-Header: value
----
.Config Parameters
......@@ -44,6 +44,7 @@ The search gateway should accept POST requests at the configured target URL with
| order | array(string) | User provided order criteria as a list of field names to order by, each optionally prefixed with `-`. (optional)
| limit | int | User provided limit for results per page. (optional)
| scroll | string | User provided scroll handle. (optional)
| fields | array(string) | User provided additional result fields to return for each hit. (optional)
| vault | string | Name of the vault this search is performed on.
| principal | object | Security context for this search request. If missing or None, assume an unauthenticated user.
| principal.name | string | Name (including domain) of the user performing the search. (optional)
......@@ -51,7 +52,7 @@ The search gateway should accept POST requests at the configured target URL with
| principal.privileged | boolean | If true, assume the user can see all results. (default: false)
|=========
The `q`, `order`, `limit` and `scroll` fields correspond to the (cleaned up) user provided search parameters as defined by the CDSTAR search API. `vault` and `principal` are added by CDSTAR. The search target should limit search results to entities visible to the specified `principal`. If no principal is present (null, missing or empty), the search should only return publicly visible results. If `principal.privileged` is true, the search should not filter by visibility and return all matching results.
The fields `q`, `order`, `limit`, `scroll` and `fields` correspond to the (cleaned up) user provided search parameters as defined by the CDSTAR search API. `vault` and `principal` are added by CDSTAR. The search target should limit search results to entities visible to the specified `principal`. If no principal is present (null, missing or empty), the search should only return publicly visible results. If `principal.privileged` is true, the search should not filter by visibility and return all matching results.
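The visibility rules described above can be sketched on the gateway side roughly as follows. This is only an illustration, not part of CDSTAR: the classes `Principal` and `Entry`, the `groups` attribute and the reader sets are hypothetical names chosen for this example, and a real gateway would usually express the same rules as a filter in its search backend.

```java
import java.util.Collections;
import java.util.List;
import java.util.Set;

public class VisibilitySketch {

    /** Hypothetical mirror of the `principal` object sent by CDSTAR. */
    public static final class Principal {
        final String name;         // may be null for an "empty" principal
        final List<String> groups; // assumed attribute name, may be empty
        final boolean privileged;

        public Principal(String name, List<String> groups, boolean privileged) {
            this.name = name;
            this.groups = groups == null ? Collections.<String>emptyList() : groups;
            this.privileged = privileged;
        }
    }

    /** Hypothetical index entry with a very simple access model. */
    public static final class Entry {
        final boolean isPublic;
        final Set<String> readerUsers;
        final Set<String> readerGroups;

        public Entry(boolean isPublic, Set<String> readerUsers, Set<String> readerGroups) {
            this.isPublic = isPublic;
            this.readerUsers = readerUsers;
            this.readerGroups = readerGroups;
        }
    }

    /** Decide whether a single entry may appear in the result for this principal. */
    public static boolean isVisible(Entry entry, Principal p) {
        // Privileged principals see everything, no filtering.
        if (p != null && p.privileged)
            return true;
        // Public entries are visible to everyone, including anonymous users.
        if (entry.isPublic)
            return true;
        // Missing principal: only public entries are allowed.
        if (p == null)
            return false;
        // An empty principal (no name, no groups) falls through to 'false'.
        if (p.name != null && entry.readerUsers.contains(p.name))
            return true;
        for (String group : p.groups)
            if (entry.readerGroups.contains(group))
                return true;
        return false;
    }
}
```

The privileged check comes first so that it also covers non-public entries without any reader lists.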
.Example Request
[source,json]
......@@ -74,7 +75,7 @@ Content-Type: application/json
== Security considerations
Since the search gateway is not supposed to authenticate the searching user and trusts the fields sent by CDSTAR, it could be used to perform searches on behalf of another user, if accessed directly by an attacker. Make sure that the gateway is only reachable from the CDSTAR instance, protected by HTTPS and some authentication mechanism (e.g. BASIC auth or additional headers).
Since the search gateway is not supposed to authenticate the searching user and trusts the fields sent by CDSTAR, it could be used to perform searches on behalf of another user, if accessed directly by an attacker. Make sure that the gateway is only reachable from the CDSTAR instance, or protected by HTTPS and some authentication mechanism (e.g. BASIC auth or additional headers).
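One way to implement the "additional headers" option is a shared secret that CDSTAR sends via the `header` plugin setting and that the gateway checks before handling a request. The following is a minimal sketch under that assumption. The class and method names are hypothetical, and the header map is assumed to use lower-cased header names as keys, since HTTP header names are case-insensitive.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Locale;
import java.util.Map;

public class GatewayAuthSketch {

    /**
     * Returns true if the request carries the expected shared secret.
     *
     * @param headers        request headers, keyed by lower-cased header name (assumption)
     * @param headerName     name of the header configured on the CDSTAR side
     * @param expectedSecret secret value configured on both sides
     */
    public static boolean isTrustedCaller(Map<String, String> headers, String headerName, String expectedSecret) {
        // Refuse to authenticate anyone if no secret is configured.
        if (expectedSecret == null || expectedSecret.isEmpty())
            return false;
        String provided = headers.get(headerName.toLowerCase(Locale.ROOT));
        if (provided == null)
            return false;
        // MessageDigest.isEqual compares the byte arrays without leaking the position of a mismatch.
        return MessageDigest.isEqual(
                provided.getBytes(StandardCharsets.UTF_8),
                expectedSecret.getBytes(StandardCharsets.UTF_8));
    }
}
```

A header check like this is only meaningful over HTTPS, otherwise the secret can be read in transit.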
......
......@@ -10,6 +10,7 @@ public class JsonQuery {
public List<String> order;
public int limit;
public String scroll;
public List<String> fields;
public String vault;
public PrincipalInfo principal = new PrincipalInfo();
......
......@@ -163,6 +163,7 @@ class ProxySearchProvider implements SearchProvider {
queryDoc.limit = q.getLimit();
if (Utils.notNullOrEmpty(q.getScrollId()))
queryDoc.scroll = q.getScrollId();
queryDoc.fields = q.getFields();
// Trusted parameters
queryDoc.vault = q.getVault();
......
package de.gwdg.cdstar.ext.proxysearch;
import java.util.List;
import java.util.Map;
import de.gwdg.cdstar.Utils;
import de.gwdg.cdstar.runtime.search.SearchHit;
......@@ -24,6 +25,11 @@ class SearchResultWrapper implements SearchResult {
return hits;
}
@Override
public Map<String, Object> getFields() {
return proxyResult.getFields();
}
@Override
public String getScrollID() {
return proxyResult.getScroll();
......@@ -61,6 +67,11 @@ class SearchResultWrapper implements SearchResult {
return proxyHit.name;
}
@Override
public Map<String, Object> getFields() {
return proxyHit.fields;
}
}
}
\ No newline at end of file
......@@ -31,7 +31,7 @@ public class ProxySearchProviderTest {
@Test
public void testPrincipalResolution() throws Exception {
final SearchQuery sq = SearchQuery.builder()
.query("q").principal("testP").group("g1").group("g2").build();
.query("q").principal("testP").groups("g1", "g2").build();
final JsonQuery jq = psp.buildQueryJson(sq);
assertEquals("q", jq.q);
......@@ -42,8 +42,8 @@ public class ProxySearchProviderTest {
@Test
public void testAllParameters() throws Exception {
final SearchQuery sq = SearchQuery.builder()
.query("q").principal("p").group("g1").group("g2").vault("v").limit(12).order("a", "b").scrollId("s")
.build();
.query("q").principal("p").groups("g1", "g2").vault("v").limit(12).order("a", "b").scrollId("s")
.build();
final JsonQuery jq = psp.buildQueryJson(sq);
assertEquals("q", jq.q);
assertEquals("v", jq.vault);
......
......@@ -28,8 +28,7 @@ public class SearchProviderTest {
final SearchQuery q = SearchQuery.builder()
.vault("testVault")
.principal("test@test")
.group("testGroup")
.group("group2")
.groups("testGroup", "group2")
.limit(13)
.order("id", "-score", "name DESC")
.query("dc.title:\"Cat pictures\"")
......
package de.gwdg.cdstar.rest.v3.async;
import java.util.ArrayList;
import java.util.List;
import de.gwdg.cdstar.Promise;
......@@ -48,15 +49,20 @@ public class SearchHandler {
query.vault(ctx.getPathParam("vault"));
query.query(qh.get(PARAM_Q));
query.limit(Utils.gate(0, qh.getInt("limit", 0, Integer.MAX_VALUE), maxResults));
query.order(qh.getAnyCsv("order").toArray(new String[] {}));
query.order(qh.getAnyCsv("order"));
query.fields(qh.getAnyCsv("fields"));
if (qh.has("scroll"))
query.scrollId(qh.get("scroll"));
for (final String group : qh.getAnyCsv("group")) {
List<String> groups = qh.getAnyCsv("group");
for (final String group : groups) {
// TODO: Qualify group names
if (!subject.isMemberOf(group))
throw new ErrorResponse(403, "InvalidGroupClaim",
"One of the claimed group memberships could not be verified.").detail("group", group);
query.group(group);
}
query.groups(groups);
if (!subject.isAnonymous())
query.principal(subject.getPrincipal().getFullId());
......@@ -111,7 +117,9 @@ public class SearchHandler {
}
return true;
} else if (token.startsWith("order:")) {
qb.order(token.substring(6).split(","));
List<String> newOrder = new ArrayList<String>(qb.getOrder());
newOrder.addAll(Utils.split(token.substring(6), ","));
qb.order(newOrder);
return true;
}
return false;
......@@ -127,6 +135,7 @@ public class SearchHandler {
hit.type = h.getType();
hit.score = h.getScore();
hit.name = h.getName();
hit.fields = h.getFields();
return hit;
});
return new SearchHits(items, rs.getTotal(), rs.getScrollID());
......
package de.gwdg.cdstar.runtime.search;
import java.util.Collections;
import java.util.Map;
public interface SearchHit {
/**
* Archive ID of this hit.
*/
String getId();
/**
* Either 'archive' or 'file'.
*/
String getType();
/**
......@@ -11,6 +20,20 @@ public interface SearchHit {
*/
String getName();
/**
* Hit relevance score. An arbitrary number computed by the backing search. A
* high value means high relevance.
*/
double getScore();
/**
* A map of additional field values requested during search. The answer to a
* {@link SearchQuery#getFields()} request should have the same key name as the
* request, or a name derived from it in a predictable but provider-specific
* way. The value can be anything, depending on the requested info.
*/
default Map<String, Object> getFields() {
return Collections.emptyMap();
}
}
......@@ -3,23 +3,68 @@ package de.gwdg.cdstar.runtime.search;
import java.util.List;
import java.util.Set;
/**
* An unmodifiable search request that can be sent to a {@link SearchProvider}.
*/
public interface SearchQuery {
/**
* A query string. Syntax depends on the implementing {@link SearchProvider}.
*/
String getQuery();
/**
* A list of fields to include in the result for each hit. A
* {@link SearchProvider} may also support more complex requests (e.g. computed
* fields or aggregates).
*/
List<String> getFields();
/**
* The principal name (full ID) of the searching user, or null if this is an
* anonymous search.
*/
String getPrincipal();
/**
* A set of groups the searching principal is a member of.
*/
Set<String> getGroups();
/**
* The vault that is searched in.
*/
String getVault();
/**
* A list of field names (or expressions) to order by. Each value may be
* prefixed by a `-` to reverse the ordering on this field. An empty list should
* result in default ordering (by score or relevance).
*/
List<String> getOrder();
/**
* A scroll ID as specified by a previous search.
*/
String getScrollId();
/**
* The maximum number of hits to return per query. A {@link SearchProvider} may
* choose to return fewer.
*/
int getLimit();
/**
* Construct a builder for a new search query.
*/
static SearchQueryBuilder builder() {
return new SearchQueryBuilder();
}
/**
* Construct a builder for an existing search query.
*/
static SearchQueryBuilder builder(SearchQuery makeCopy) {
return SearchQueryBuilder.copyOf(makeCopy);
}
}
package de.gwdg.cdstar.runtime.search;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import de.gwdg.cdstar.Utils;
public class SearchQueryBuilder implements SearchQuery {
private final Set<String> groups = new HashSet<>();
private List<String> fields = new ArrayList<>();
private String principal;
private String query;
private final List<String> order = new ArrayList<>();
......@@ -16,16 +21,51 @@ public class SearchQueryBuilder implements SearchQuery {
private String vault;
private int limit;
public SearchQueryBuilder group(String group) {
groups.add(group);
/**
* Copy all values from an existing {@link SearchQuery} into a new builder.
*/
static SearchQueryBuilder copyOf(SearchQuery query) {
final SearchQueryBuilder copy = new SearchQueryBuilder();
copy.groups.addAll(query.getGroups());
copy.fields.addAll(query.getFields());
copy.order.addAll(query.getOrder());
copy.principal = query.getPrincipal();
copy.query = query.getQuery();
copy.limit = query.getLimit();
copy.scrollId = query.getScrollId();
copy.vault = query.getVault();
return copy;
}
/**
* Claim to be a member of one or more user groups. The group membership is
* validated, and then forwarded to the {@link SearchProvider} to allow search
* in non-public records.
*/
public SearchQueryBuilder groups(Collection<String> groups) {
this.groups.clear();
if (Utils.notNullOrEmpty(groups))
this.groups.addAll(groups);
return this;
}
/**
* See {@link #groups(Collection)}
*/
public SearchQueryBuilder groups(String... groups) {
return this.groups(Arrays.asList(groups));
}
@Override
public Set<String> getGroups() {
return Collections.unmodifiableSet(groups);
}
/**
* Claim to be this principal. The correctness is validated, and the value is
* then forwarded to the {@link SearchProvider} to allow search in non-public
* records.
*/
public SearchQueryBuilder principal(String principal) {
this.principal = principal;
return this;
......@@ -36,6 +76,9 @@ public class SearchQueryBuilder implements SearchQuery {
return principal;
}
/**
* Set the query string for this search. This is required.
*/
public SearchQueryBuilder query(String query) {
this.query = query;
return this;
......@@ -46,15 +89,25 @@ public class SearchQueryBuilder implements SearchQuery {
return query;
}
public SearchQueryBuilder resetOrder() {
order.clear();
/**
* Set the field names to order results by. Values may be prefixed with `-` to
* signal reversed ordering for that field. A {@link SearchProvider} may support
* more complex ordering (e.g. computed fields) so the values do not necessarily
* have to correspond to actual index field names. An empty collection will
* result in default ordering (usually by score).
*/
public SearchQueryBuilder order(Collection<String> order) {
this.order.clear();
if (Utils.notNullOrEmpty(order))
this.order.addAll(order);
return this;
}
/**
* See {@link #order(Collection)}
*/
public SearchQueryBuilder order(String... order) {
for (final String o : order)
this.order.add(o);
return this;
return order(Arrays.asList(order));
}
@Override
......@@ -62,6 +115,9 @@ public class SearchQueryBuilder implements SearchQuery {
return Collections.unmodifiableList(order);
}
/**
* Set a scroll ID to continue a previous search.
*/
public SearchQueryBuilder scrollId(String scrollId) {
this.scrollId = scrollId;
return this;
......@@ -72,6 +128,35 @@ public class SearchQueryBuilder implements SearchQuery {
return scrollId;
}
/**
* Request additional information to be returned as part of the result set for
* each hit. At the very least, a {@link SearchProvider} should accept requests
* for index fields by name, and return their values as search hit properties.
* Support for more sophisticated field requests (e.g. computations or
* aggregates) is optional.
*/
public SearchQueryBuilder fields(Collection<String> fields) {
this.fields.clear();
if (Utils.notNullOrEmpty(fields))
this.fields.addAll(fields);
return this;
}
/**
* See {@link #fields(Collection)}
*/
public SearchQueryBuilder fields(String... fields) {
return fields(Arrays.asList(fields));
}
@Override
public List<String> getFields() {
return Collections.unmodifiableList(fields);
}
/**
* Set the vault to search in.
*/
public SearchQueryBuilder vault(String vault) {
this.vault = vault;
return this;
......@@ -82,6 +167,9 @@ public class SearchQueryBuilder implements SearchQuery {
return vault;
}
/**
* Set a maximum number of hits to be returned per result.
*/
public SearchQueryBuilder limit(int limit) {
this.limit = limit;
return this;
......@@ -96,15 +184,7 @@ public class SearchQueryBuilder implements SearchQuery {
* Returns a {@link SearchQuery} based on the current state of the builder.
*/
public SearchQuery build() {
final SearchQueryBuilder copy = new SearchQueryBuilder();
copy.groups.addAll(groups);
copy.limit = limit;
copy.order.addAll(order);
copy.principal = principal;
copy.query = query;
copy.scrollId = scrollId;
copy.vault = vault;
return copy;
return copyOf(this);
}
}
package de.gwdg.cdstar.runtime.search;
import java.util.Collections;
import java.util.List;
import java.util.Map;
public interface SearchResult {
/**
* Number of hits in this result.
*/
default int getSize() {
return hits().size();
}
/**
* List of result hits.
*/
List<SearchHit> hits();
/**
* Return a string that can be used to fetch the next page of the current
* search result.
* Return a string that can be used to fetch the next page of the current search
* result.
*/
String getScrollID();
/**
* Number of total results, or -1 if unknown.
*/
long getTotal();
/**
* A map of additional field values requested during search. The answer to a
* {@link SearchQuery#getFields()} request should have the same key name as the
* request, or a name derived from it in a predictable but provider-specific
* way. The value can be anything, depending on the requested info.
*/
default Map<String, Object> getFields(){
return Collections.emptyMap();
}
}
package de.gwdg.cdstar.web.common.model;
import java.util.List;
import java.util.Map;
import com.fasterxml.jackson.annotation.JsonInclude;
import com.fasterxml.jackson.annotation.JsonInclude.Include;
......@@ -10,6 +11,7 @@ import de.gwdg.cdstar.web.common.model.SearchHits.Hit;
public class SearchHits extends AbstractList<Hit> {
private String scroll;
public Map<String, Object> fields;
public SearchHits() {
super();
......@@ -30,6 +32,11 @@ public class SearchHits extends AbstractList<Hit> {
return scroll;
}
@JsonInclude(Include.NON_EMPTY)
public Map<String, Object> getFields() {
return fields;
}
public void setScroll(String scroll) {